==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IRs", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
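As a contrast to the ill-formed instruction above, the following sketch
shows one way a value can legitimately refer to a definition that appears
later in the text: a ``phi`` node in a loop header. The function name
``@count_to`` and all value names are illustrative assumptions.

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x.next is defined further down, but a phi use is treated as
      ; occurring at the end of the incoming block, where it is dominated.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 %x, 1
      %done = icmp eq i32 %x.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %x.next
    }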
63
.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
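The identifier formats listed above can be sketched in a small module. All
names here (``@counter``, ``@ident_demo`` and so on) are made up for
illustration.

.. code-block:: llvm

    ; A plain named global and a quoted global containing a space.
    @counter = global i32 0
    @"two words" = global i32 0
    ; \22 is the hexadecimal ASCII code for a double quote.
    @"say \22hi\22" = global i32 0

    define i32 @ident_demo(i32 %x) {
    entry:
      %"tmp value" = add i32 %x, 1        ; quoted local name
      %0 = mul i32 %"tmp value", 2        ; unnamed value
      ret i32 %0
    }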
87
LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc.), for primitive type names ('``void``',
'``i32``', etc.), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.
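The numbering rule for unnamed values is easy to get wrong, so here is a
minimal sketch of two cases. The function names are hypothetical; the
numbers follow from the per-function counter described above.

.. code-block:: llvm

    ; All parameters are named and the entry block has no label,
    ; so the entry block takes number 0.
    define i32 @named_param(i32 %a) {
      %1 = add i32 %a, 1                  ; yields i32:%1
      ret i32 %1
    }

    ; The unnamed parameter is %0 and the unlabeled entry block is %1.
    define i32 @unnamed_param(i32) {
      %2 = add i32 %0, 1                  ; yields i32:%2
      ret i32 %2
    }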
140
High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will. This linkage type is only allowed on definitions, not
    declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but they
    are used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.
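To show where the linkage keyword sits in practice, here is a small sketch
of a module that uses several of the linkage types above. Every symbol name
is hypothetical, and the typed-pointer syntax of the other examples in this
document is assumed.

.. code-block:: llvm

    @.msg = private unnamed_addr constant [3 x i8] c"hi\00"
    @counter = internal global i32 0
    @tentative = common global i32 0        ; zero initializer required
    @maybe_data = extern_weak global i32    ; declaration only

    ; May be merged with an equivalent definition from another module.
    define linkonce_odr i32 @twice(i32 %x) {
      %r = add i32 %x, %x
      ret i32 %r
    }

    ; May be overridden by a stronger definition at link time.
    define weak void @hook() {
      ret void
    }

    declare extern_weak void @optional_fn()
    declare void @external_fn()             ; no keyword: external linkage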
277
.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating-point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and executed frequently. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#id80>`_ This
    calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
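The requirement that caller and callee agree on the convention can be
sketched as follows. The function names are illustrative assumptions; the
point is only that the keyword appears both on the definition and on every
call site.

.. code-block:: llvm

    define fastcc i32 @add_one(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    define i32 @caller(i32 %x) {
      ; The call site repeats the callee's convention. Omitting fastcc
      ; here would make the call undefined behavior.
      %r = call fastcc i32 @add_one(i32 %x)
      ret i32 %r
    }

    define tailcc i32 @sum_down(i32 %n, i32 %acc) {
    entry:
      %done = icmp eq i32 %n, 0
      br i1 %done, label %exit, label %next

    next:
      %n1 = sub i32 %n, 1
      %acc1 = add i32 %acc, %n
      ; A call in tail position under tailcc.
      %r = tail call tailcc i32 @sum_down(i32 %n1, i32 %acc1)
      ret i32 %r

    exit:
      ret i32 %acc
    }

    ; A convention selected by number (10 is the GHC convention).
    declare cc 10 void @ghc_style(i64)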
470
.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
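A brief sketch of how the visibility keyword is written, using made-up
symbol names:

.. code-block:: llvm

    @impl_detail = hidden global i32 0
    @stable_api = protected global i32 0

    define hidden void @helper() {
      ret void
    }

    ; Internal linkage, so the visibility is left at the default.
    @local_state = internal global i32 0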
500
.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.
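As a rough illustration, the two storage classes might appear as follows in
a module built for a Windows target. The symbol names are hypothetical.

.. code-block:: llvm

    ; Provided by some other DLL and reached through its import pointer.
    @imported_var = external dllimport global i32
    declare dllimport void @imported_fn()

    ; Part of this module's DLL interface.
    @exported_var = dllexport global i32 0
    define dllexport void @exported_fn() {
      ret void
    }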
521
.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models
may be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support of ELF TLS model, the ``-femulated-tls``
flag can be used to generate GCC-compatible emulated TLS code.
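The model, when given, is written in parentheses after ``thread_local``. A
minimal sketch with hypothetical variable names:

.. code-block:: llvm

    @tls_general = thread_local global i32 0              ; general dynamic
    @tls_ld = thread_local(localdynamic) global i32 0
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0

    define i32 @read_tls() {
      ; Each thread reads its own copy of the variable.
      %v = load i32, i32* @tls_le
      ret i32 %v
    }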
552
.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.
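The specifier is written after the linkage type, as in this short sketch
(names are illustrative):

.. code-block:: llvm

    ; Equivalent to writing no specifier at all.
    @preemptable = dso_preemptable global i32 0

    @local_var = dso_local global i32 0

    ; Defined in another compilation unit of the same linkage unit,
    ; so it can still be accessed directly.
    declare dso_local void @defined_elsewhere()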
570
.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.
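The difference between the kinds of structure type can be sketched as
follows. The type and symbol names are hypothetical.

.. code-block:: llvm

    %pair = type { i32, i32 }           ; identified
    %also_pair = type { i32, i32 }      ; distinct from %pair, same body
    %handle = type opaque               ; forward declaration, no body yet

    ; A literal structure type is written inline and uniqued by structure.
    @lit = global { i32, i32 } { i32 1, i32 2 }
    @named = global %pair { i32 3, i32 4 }

    ; An opaque type can still be used behind a pointer.
    declare void @use_handle(%handle*)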
589
.. _nointptrtype:

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
integral (i.e. normal) pointers in that they convert integers to and from
corresponding pointer types, but there are additional implications to be
aware of.  Because the bit-representation of a non-integral pointer may
not be stable, two identical casts of the same operand may or may not
return the same value.  Said differently, the conversion to or from the
non-integral type depends on environmental state in an implementation
defined manner.

If the frontend wishes to observe a *particular* value following a cast, the
generated IR must fence with the underlying environment in an implementation
defined manner. (In practice, this tends to require ``noinline`` routines for
such operations.)

From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
non-integral types are analogous to ones on integral types with one
key exception: the optimizer may not, in general, insert new dynamic
occurrences of such casts.  If a new cast is inserted, the optimizer would
need to either ensure that a) all possible values are valid, or b)
appropriate fencing is inserted.  Since the appropriate fencing is
implementation defined, the optimizer can't do the latter.  The former is
challenging as many commonly expected properties, such as
``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
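As a sketch only, a module might mark address space 1 as non-integral with
an ``ni`` entry in its datalayout string. The choice of address space 1 and
the function name are assumptions made for this example.

.. code-block:: llvm

    target datalayout = "e-ni:1"

    define i64 @addr_of(i8 addrspace(1)* %p) {
      ; The cast is legal, but repeating it on the same operand is not
      ; guaranteed to yield the same integer.
      %bits = ptrtoint i8 addrspace(1)* %p to i64
      ret i64 %bits
    }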
627
628.. _globalvars:
629
630Global Variables
631----------------
632
633Global variables define regions of memory allocated at compilation time
634instead of run-time.
635
636Global variable definitions must be initialized.
637
638Global variables in other translation units can also be declared, in which
639case they don't have an initializer.
640
641Global variables can optionally specify a :ref:`linkage type <linkage>`.
642
643Either global variable definitions or declarations may have an explicit section
644to be placed in and may have an optional explicit alignment specified. If there
645is a mismatch between the explicit or inferred section information for the
646variable declaration and its definition the resulting behavior is undefined.
647
648A variable may be defined as a global ``constant``, which indicates that
649the contents of the variable will **never** be modified (enabling better
650optimization, allowing the global data to be placed in the read-only
651section of an executable, etc). Note that variables that need runtime
652initialization cannot be marked ``constant`` as there is a store to the
653variable.
654
655LLVM explicitly allows *declarations* of global variables to be marked
656constant, even if the final definition of the global is not. This
657capability can be used to enable slightly better optimization of the
658program, but requires the language definition to guarantee that
659optimizations based on the 'constantness' are valid for the translation
660units that do not include the definition.
661
662As SSA values, global variables define pointer values that are in scope
663(i.e. they dominate) all basic blocks in the program. Global variables
664always define a pointer to their "content" type because they describe a
665region of memory, and all memory objects in LLVM are accessed through
666pointers.
667
668Global variables can be marked with ``unnamed_addr`` which indicates
669that the address is not significant, only the content. Constants marked
670like this can be merged with other constants if they have the same
671initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
673whose address is significant.
674
675If the ``local_unnamed_addr`` attribute is given, the address is known to
676not be significant within the module.
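
As a minimal sketch of the markers described above (the names ``@.str``,
``@table`` and ``@limit`` are hypothetical), the following shows a mergeable
constant, a constant whose address is only known to be insignificant within the
module, and a declaration marked ``constant``:

.. code-block:: llvm

    ; Address is not significant, so this may be merged with an identical constant.
    @.str  = private unnamed_addr constant [6 x i8] c"hello\00"
    ; Address is known not to be significant within this module only.
    @table = local_unnamed_addr constant [2 x i32] [i32 1, i32 2]
    ; Declaration marked constant; the definition lives in another module.
    @limit = external constant i32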
677
678A global variable may be declared to reside in a target-specific
679numbered address space. For targets that support them, address spaces
680may affect how optimizations are performed and/or what target
681instructions are used to access the variable. The default address space
682is zero. The address space qualifier must precede any other attributes.
683
684LLVM allows an explicit section to be specified for globals. If the
685target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the necessary
support.
688
689External declarations may have an explicit section specified. Section
690information is retained in LLVM IR for targets that make use of this
691information. Attaching section information to an external declaration is an
692assertion that its definition is located in the specified section. If the
693definition is located in a different section, the behavior is undefined.
694
695By default, global initializers are optimized by assuming that global
696variables defined within the module are not modified from their
697initial values before the start of the global initializer. This is
698true even for variables potentially accessible from outside the
699module, including those with external linkage or appearing in
700``@llvm.used`` or dllexported variables. This assumption may be suppressed
701by marking the variable with ``externally_initialized``.
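
For illustration, assuming a hypothetical variable ``@config`` that some
external agent may overwrite before the module's code runs, the marker is
written as follows:

.. code-block:: llvm

    ; The initializer 0 must not be assumed to be the value observed at startup.
    @config = externally_initialized global i32 0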
702
703An explicit alignment may be specified for a global, which must be a
704power of 2. If not present, or if the alignment is set to zero, the
705alignment of the global is set by the target to whatever it feels
706convenient. If an explicit alignment is specified, the global is forced
707to have exactly that alignment. Targets and optimizers are not allowed
708to over-align the global if the global has an assigned section. In this
709case, the extra alignment could be observable: for example, code could
710assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
712iteration. The maximum alignment is ``1 << 32``.
713
For global variable declarations, as well as definitions that may be
715replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
716linkage types), LLVM makes no assumptions about the allocation size of the
717variables, except that they may not overlap. The alignment of a global variable
718declaration or replaceable definition must not be greater than the alignment of
719the definition it resolves to.
720
721Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
722an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
724an optional list of attached :ref:`metadata <metadata>`.
725
726Variables and aliases can have a
727:ref:`Thread Local Storage Model <tls_model>`.
728
729:ref:`Scalable vectors <t_vector>` cannot be global variables or members of
730arrays because their size is unknown at compile time. They are allowed in
731structs to facilitate intrinsics returning multiple values. Structs containing
732scalable vectors cannot be used in loads, stores, allocas, or GEPs.
733
734Syntax::
735
736      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
737                         [DLLStorageClass] [ThreadLocal]
738                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
739                         [ExternallyInitialized]
740                         <global | constant> <Type> [<InitializerConstant>]
741                         [, section "name"] [, comdat [($name)]]
742                         [, align <Alignment>] (, !name !N)*
743
744For example, the following defines a global in a numbered address space
745with an initializer, section, and alignment:
746
747.. code-block:: llvm
748
749    @G = addrspace(5) constant float 1.0, section "foo", align 4
750
The following example just declares a global variable:
752
753.. code-block:: llvm
754
755   @G = external global i32
756
757The following example defines a thread-local global with the
758``initialexec`` TLS model:
759
760.. code-block:: llvm
761
762    @G = thread_local(initialexec) global i32 0, align 4
763
764.. _functionstructure:
765
766Functions
767---------
768
769LLVM function definitions consist of the "``define``" keyword, an
770optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
771specifier <runtime_preemption_model>`,  an optional :ref:`visibility
772style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
773an optional :ref:`calling convention <callingconv>`,
774an optional ``unnamed_addr`` attribute, a return type, an optional
775:ref:`parameter attribute <paramattrs>` for the return type, a function
776name, a (possibly empty) argument list (each with optional :ref:`parameter
777attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
778an optional address space, an optional section, an optional alignment,
779an optional :ref:`comdat <langref_comdats>`,
780an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
781an optional :ref:`prologue <prologuedata>`,
782an optional :ref:`personality <personalityfn>`,
783an optional list of attached :ref:`metadata <metadata>`,
784an opening curly brace, a list of basic blocks, and a closing curly brace.
785
786LLVM function declarations consist of the "``declare``" keyword, an
787optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
788<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
789optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
790or ``local_unnamed_addr`` attribute, an optional address space, a return type,
791an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
792empty list of arguments, an optional alignment, an optional :ref:`garbage
793collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
794:ref:`prologue <prologuedata>`.
795
796A function definition contains a list of basic blocks, forming the CFG (Control
797Flow Graph) for the function. Each basic block may optionally start with a label
798(giving the basic block a symbol table entry), contains a list of instructions,
799and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
800function return). If an explicit label name is not provided, a block is assigned
801an implicit numbered label, using the next value from the same counter as used
802for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
803function entry block does not have an explicit label, it will be assigned label
804"%0", then the first unnamed temporary in that block will be "%1", etc. If a
805numeric label is explicitly specified, it must match the numeric label that
806would be used implicitly.
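
The numbering rule can be sketched with a small function (``@abs_sketch`` is a
hypothetical name). Because the only argument is named, the unlabeled entry
block takes ``%0`` and the first unnamed temporary is ``%1``:

.. code-block:: llvm

    define i32 @abs_sketch(i32 %x) {
      ; The entry block has no explicit label, so it is implicitly %0.
      %1 = icmp slt i32 %x, 0
      br i1 %1, label %neg, label %done

    neg:
      %2 = sub i32 0, %x
      br label %done

    done:
      ; The implicit entry label can be referenced like any other label.
      %r = phi i32 [ %2, %neg ], [ %x, %0 ]
      ret i32 %r
    }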
807
808The first basic block in a function is special in two ways: it is
809immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
811the entry block of a function). Because the block can have no
812predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
813
814LLVM allows an explicit section to be specified for functions. If the
815target supports it, it will emit functions to the section specified.
816Additionally, the function can be placed in a COMDAT.
817
818An explicit alignment may be specified for a function. If not present,
819or if the alignment is set to zero, the alignment of the function is set
820by the target to whatever it feels convenient. If an explicit alignment
821is specified, the function is forced to have at least that much
822alignment. All alignments must be a power of 2.
823
824If the ``unnamed_addr`` attribute is given, the address is known to not
825be significant and two identical functions can be merged.
826
827If the ``local_unnamed_addr`` attribute is given, the address is known to
828not be significant within the module.
829
830If an explicit address space is not given, it will default to the program
831address space from the :ref:`datalayout string<langref_datalayout>`.
832
833Syntax::
834
835    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
836           [cconv] [ret attrs]
837           <ResultType> @<FunctionName> ([argument list])
838           [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
839           [section "name"] [comdat [($name)]] [align N] [gc] [prefix Constant]
840           [prologue Constant] [personality Constant] (!name !N)* { ... }
841
842The argument list is a comma separated sequence of arguments where each
843argument is of the following form:
844
845Syntax::
846
847   <type> [parameter Attrs] [name]
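
As a sketch of how these pieces fit together, the following definition uses
several of the optional components in the order given by the syntax above. The
function name, section name, and choice of attributes are illustrative only:

.. code-block:: llvm

    ; linkage, calling convention, return attribute, return type, name,
    ; argument list, unnamed_addr, function attributes, section, alignment.
    define internal fastcc zeroext i8 @low_byte(i32 %x) unnamed_addr nounwind readnone section ".text.hot" align 16 {
    entry:
      %t = trunc i32 %x to i8
      ret i8 %t
    }

    ; A declaration has no body.
    declare i32 @puts(i8* nocapture) nounwind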
848
849
850.. _langref_aliases:
851
852Aliases
853-------
854
Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.
857
858Aliases have a name and an aliasee that is either a global value or a
859constant expression.
860
861Aliases may have an optional :ref:`linkage type <linkage>`, an optional
862:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
863:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
864<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
865
866Syntax::
867
868    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
869
870The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
871``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
872might not correctly handle dropping a weak symbol that is aliased.
873
874Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
875the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
876to the same content.
877
878If the ``local_unnamed_addr`` attribute is given, the address is known to
879not be significant within the module.
880
881Since aliases are only a second name, some restrictions apply, of which
882some can only be checked when producing an object file:
883
884* The expression defining the aliasee must be computable at assembly
885  time. Since it is just a name, no relocations can be used.
886
887* No alias in the expression can be weak as the possibility of the
888  intermediate alias being overridden cannot be represented in an
889  object file.
890
891* No global value in the expression can be a declaration, since that
892  would require a relocation, which is not possible.
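
A minimal sketch of the alias syntax, using hypothetical names, with one alias
to a variable and one weak alias to a function:

.. code-block:: llvm

    @var = global i32 7

    define void @impl() {
      ret void
    }

    ; Both aliasees are definitions in this module, as required above.
    @var_alias = alias i32, i32* @var
    @impl_weak = weak alias void (), void ()* @impl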
893
894.. _langref_ifunc:
895
896IFuncs
------
898
IFuncs, like aliases, don't create any new data or functions. They are just a new
symbol that the dynamic linker resolves at runtime by calling a resolver function.

IFuncs have a name and a resolver, which is a function called by the dynamic linker
that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.
907
908Syntax::
909
910    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
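
The following is a sketch only, with hypothetical names, of an IFunc whose
resolver returns the address of a single implementation. It assumes the
resolver is written as a function returning a pointer to the IFunc's function
type; the exact resolver type accepted may differ between LLVM versions:

.. code-block:: llvm

    define i32 @impl_generic(i32 %x) {
      ret i32 %x
    }

    ; Called by the dynamic linker; returns the implementation to bind to @impl.
    define i32 (i32)* @resolve_impl() {
      ret i32 (i32)* @impl_generic
    }

    @impl = ifunc i32 (i32), i32 (i32)* ()* @resolve_impl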
911
912
913.. _langref_comdats:
914
915Comdats
916-------
917
918Comdat IR provides access to object file COMDAT/section group functionality
919which represents interrelated sections.
920
921Comdats have a name which represents the COMDAT key and a selection kind to
922provide input on how the linker deduplicates comdats with the same key in two
923different object files. A comdat must be included or omitted as a unit.
924Discarding the whole comdat is allowed but discarding a subset is not.
925
926A global object may be a member of at most one comdat. Aliases are placed in the
927same COMDAT that their aliasee computes to, if any.
928
929Syntax::
930
931    $<Name> = comdat SelectionKind
932
933For selection kinds other than ``nodeduplicate``, only one of the duplicate
934comdats may be retained by the linker and the members of the remaining comdats
935must be discarded. The following selection kinds are supported:
936
937``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
939``exactmatch``
940    The linker may choose any COMDAT key but the sections must contain the
941    same data.
942``largest``
943    The linker will choose the section containing the largest COMDAT key.
944``nodeduplicate``
945    No deduplication is performed.
946``samesize``
947    The linker may choose any COMDAT key but the sections must contain the
948    same amount of data.
949
950- XCOFF and Mach-O don't support COMDATs.
951- COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
952  a non-local linkage COMDAT symbol.
953- ELF supports ``any`` and ``nodeduplicate``.
954- WebAssembly only supports ``any``.
955
956Here is an example of a COFF COMDAT where a function will only be selected if
957the COMDAT key's section is the largest:
958
959.. code-block:: text
960
961   $foo = comdat largest
962   @foo = global i32 2, comdat($foo)
963
964   define void @bar() comdat($foo) {
965     ret void
966   }
967
968In a COFF object file, this will create a COMDAT section with selection kind
969``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
970and another COMDAT section with selection kind
971``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
972section and contains the contents of the ``@bar`` symbol.
973
As syntactic sugar, the ``$name`` can be omitted if the name is the same as
975the global name:
976
977.. code-block:: llvm
978
979  $foo = comdat any
980  @foo = global i32 2, comdat
981  @bar = global i32 3, comdat($foo)
982
983There are some restrictions on the properties of the global object.
984It, or an alias to it, must have the same name as the COMDAT group when
985targeting COFF.
986The contents and size of this object may be used during link-time to determine
987which COMDAT groups get selected depending on the selection kind.
988Because the name of the object must match the name of the COMDAT group, the
989linkage of the global object must not be local; local symbols can get renamed
990if a collision occurs in the symbol table.
991
The combined use of COMDATs and section attributes may yield surprising results.
993For example:
994
995.. code-block:: llvm
996
997   $foo = comdat any
998   $bar = comdat any
999   @g1 = global i32 42, section "sec", comdat($foo)
1000   @g2 = global i32 42, section "sec", comdat($bar)
1001
1002From the object file perspective, this requires the creation of two sections
1003with the same name. This is necessary because both globals belong to different
1004COMDAT groups and COMDATs, at the object file level, are represented by
1005sections.
1006
1007Note that certain IR constructs like global variables and functions may
1008create COMDATs in the object file in addition to any which are specified using
1009COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
1012
1013.. _namedmetadatastructure:
1014
1015Named Metadata
1016--------------
1017
1018Named metadata is a collection of metadata. :ref:`Metadata
1019nodes <metadata>` (but not metadata strings) are the only valid
1020operands for a named metadata.
1021
1022#. Named metadata are represented as a string of characters with the
1023   metadata prefix. The rules for metadata names are the same as for
1024   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1025   are still valid, which allows any character to be part of a name.
1026
1027Syntax::
1028
1029    ; Some unnamed metadata nodes, which are referenced by the named metadata.
1030    !0 = !{!"zero"}
1031    !1 = !{!"one"}
1032    !2 = !{!"two"}
1033    ; A named metadata.
1034    !name = !{!0, !1, !2}
1035
1036.. _paramattrs:
1037
1038Parameter Attributes
1039--------------------
1040
1041The return type and each parameter of a function type may have a set of
1042*parameter attributes* associated with them. Parameter attributes are
1043used to communicate additional information about the result or
1044parameters of a function. Parameter attributes are considered to be part
1045of the function, not of the function type, so functions with different
1046parameter attributes can have the same function type.
1047
1048Parameter attributes are simple keywords that follow the type specified.
1049If multiple parameter attributes are needed, they are space separated.
1050For example:
1051
1052.. code-block:: llvm
1053
1054    declare i32 @printf(i8* noalias nocapture, ...)
1055    declare i32 @atoi(i8 zeroext)
1056    declare signext i8 @returns_signed_char()
1057
Note that function attributes (such as ``nounwind`` and ``readonly``) come
immediately after the argument list, while attributes for the function result
(such as ``signext`` above) precede the return type.
1060
1061Currently, only the following parameter attributes are defined:
1062
1063``zeroext``
1064    This indicates to the code generator that the parameter or return
1065    value should be zero-extended to the extent required by the target's
1066    ABI by the caller (for a parameter) or the callee (for a return value).
1067``signext``
1068    This indicates to the code generator that the parameter or return
1069    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
1071    the callee (for a return value).
1072``inreg``
1073    This indicates that this parameter or return value should be treated
1074    in a special target-dependent fashion while emitting code for
1075    a function call or return (usually, by putting it in a register as
1076    opposed to memory, though some targets use it to distinguish between
1077    two different kinds of registers). Use of this attribute is
1078    target-specific.
1079``byval(<ty>)``
1080    This indicates that the pointer parameter should really be passed by
1081    value to the function. The attribute implies that a hidden copy of
1082    the pointee is made between the caller and the callee, so the callee
1083    is unable to modify the value in the caller. This attribute is only
1084    valid on LLVM pointer arguments. It is generally used to pass
1085    structs and arrays by value, but is also valid on pointers to
1086    scalars. The copy is considered to belong to the caller not the
1087    callee (for example, ``readonly`` functions should not write to
1088    ``byval`` parameters). This is not a valid attribute for return
1089    values.
1090
1091    The byval type argument indicates the in-memory value type, and
1092    must be the same as the pointee type of the argument.
1093
1094    The byval attribute also supports specifying an alignment with the
1095    align attribute. It indicates the alignment of the stack slot to
1096    form and the known alignment of the pointer specified to the call
1097    site. If the alignment is not specified, then the code generator
1098    makes a target-specific assumption.
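
    As a minimal sketch (the type and function names are hypothetical), a
    struct passed by value with an explicit stack slot alignment looks like
    this at both the declaration and the call site:

    .. code-block:: llvm

        %struct.Point = type { i32, i32 }

        declare void @consume(%struct.Point* byval(%struct.Point) align 4)

        define void @caller(%struct.Point* %p) {
          ; The callee receives a hidden copy; writes there do not affect %p.
          call void @consume(%struct.Point* byval(%struct.Point) align 4 %p)
          ret void
        }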
1099
1100.. _attr_byref:
1101
1102``byref(<ty>)``
1103
1104    The ``byref`` argument attribute allows specifying the pointee
1105    memory type of an argument. This is similar to ``byval``, but does
1106    not imply a copy is made anywhere, or that the argument is passed
1107    on the stack. This implies the pointer is dereferenceable up to
1108    the storage size of the type.
1109
    It is not generally permissible to introduce a write to a
1111    ``byref`` pointer. The pointer may have any address space and may
1112    be read only.
1113
1114    This is not a valid attribute for return values.
1115
    The alignment for a ``byref`` parameter can be explicitly
1117    specified by combining it with the ``align`` attribute, similar to
1118    ``byval``. If the alignment is not specified, then the code generator
1119    makes a target-specific assumption.
1120
1121    This is intended for representing ABI constraints, and is not
1122    intended to be inferred for optimization use.
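
    For illustration only, a hypothetical declaration taking a ``byref``
    pointer in a non-default address space, with an explicit alignment:

    .. code-block:: llvm

        ; No copy is implied; the pointee is an i32 that may be read-only.
        declare void @reads_arg(i32 addrspace(1)* byref(i32) align 4)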
1123
1124.. _attr_preallocated:
1125
1126``preallocated(<ty>)``
1127    This indicates that the pointer parameter should really be passed by
1128    value to the function, and that the pointer parameter's pointee has
1129    already been initialized before the call instruction. This attribute
1130    is only valid on LLVM pointer arguments. The argument must be the value
1131    returned by the appropriate
1132    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1133    ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1134    calls, although it is ignored during codegen.
1135
1136    A non ``musttail`` function call with a ``preallocated`` attribute in
1137    any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1138    function call cannot have a ``"preallocated"`` operand bundle.
1139
1140    The preallocated attribute requires a type argument, which must be
1141    the same as the pointee type of the argument.
1142
1143    The preallocated attribute also supports specifying an alignment with the
1144    align attribute. It indicates the alignment of the stack slot to
1145    form and the known alignment of the pointer specified to the call
1146    site. If the alignment is not specified, then the code generator
1147    makes a target-specific assumption.
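
    The sequence described above can be sketched as follows. The struct and
    function names are hypothetical, and the intrinsic signatures are assumed
    to match those documented for ``llvm.call.preallocated.setup`` and
    ``llvm.call.preallocated.arg``:

    .. code-block:: llvm

        %struct.A = type { i32 }

        declare token @llvm.call.preallocated.setup(i32)
        declare i8* @llvm.call.preallocated.arg(token, i32)
        declare void @take(%struct.A* preallocated(%struct.A))

        define void @caller() {
          ; Reserve storage for one preallocated argument.
          %t = call token @llvm.call.preallocated.setup(i32 1)
          ; Obtain argument 0 and initialize it in place before the call.
          %raw = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%struct.A)
          %arg = bitcast i8* %raw to %struct.A*
          %f = getelementptr inbounds %struct.A, %struct.A* %arg, i32 0, i32 0
          store i32 42, i32* %f
          ; A non-musttail call carries the "preallocated" operand bundle.
          call void @take(%struct.A* preallocated(%struct.A) %arg) ["preallocated"(token %t)]
          ret void
        }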
1148
1149.. _attr_inalloca:
1150
1151``inalloca(<ty>)``
1152
1153    The ``inalloca`` argument attribute allows the caller to take the
1154    address of outgoing stack arguments. An ``inalloca`` argument must
1155    be a pointer to stack memory produced by an ``alloca`` instruction.
1156    The alloca, or argument allocation, must also be tagged with the
1157    inalloca keyword. Only the last argument may have the ``inalloca``
1158    attribute, and that argument is guaranteed to be passed in memory.
1159
1160    An argument allocation may be used by a call at most once because
1161    the call may deallocate it. The ``inalloca`` attribute cannot be
1162    used in conjunction with other attributes that affect argument
1163    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1164    ``inalloca`` attribute also disables LLVM's implicit lowering of
1165    large aggregate return values, which means that frontend authors
1166    must lower them with ``sret`` pointers.
1167
1168    When the call site is reached, the argument allocation must have
1169    been the most recent stack allocation that is still live, or the
1170    behavior is undefined. It is possible to allocate additional stack
1171    space after an argument allocation and before its call site, but it
1172    must be cleared off with :ref:`llvm.stackrestore
1173    <int_stackrestore>`.
1174
1175    The inalloca attribute requires a type argument, which must be the
1176    same as the pointee type of the argument.
1177
1178    See :doc:`InAlloca` for more information on how to use this
1179    attribute.
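
    A minimal sketch of the pattern, with hypothetical type and function
    names: the argument memory is allocated with ``alloca inalloca``, filled
    in, passed as the last argument, and the stack is restored afterwards.

    .. code-block:: llvm

        %args = type { i32, i32 }

        declare i8* @llvm.stacksave()
        declare void @llvm.stackrestore(i8*)
        declare void @callee(%args* inalloca(%args))

        define void @caller() {
          %sp = call i8* @llvm.stacksave()
          ; The argument allocation must itself be tagged inalloca.
          %mem = alloca inalloca %args
          %a = getelementptr inbounds %args, %args* %mem, i32 0, i32 0
          %b = getelementptr inbounds %args, %args* %mem, i32 0, i32 1
          store i32 1, i32* %a
          store i32 2, i32* %b
          ; %mem is the most recent live stack allocation at this point.
          call void @callee(%args* inalloca(%args) %mem)
          call void @llvm.stackrestore(i8* %sp)
          ret void
        }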
1180
1181``sret(<ty>)``
1182    This indicates that the pointer parameter specifies the address of a
1183    structure that is the return value of the function in the source
1184    program. This pointer must be guaranteed by the caller to be valid:
1185    loads and stores to the structure may be assumed by the callee not
1186    to trap and to be properly aligned. This is not a valid attribute
1187    for return values.
1188
1189    The sret type argument specifies the in memory type, which must be
1190    the same as the pointee type of the argument.
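
    As an illustrative sketch (``%struct.Pair`` and ``@make_pair`` are
    hypothetical), a function returning a struct through an ``sret`` pointer
    might be written as:

    .. code-block:: llvm

        %struct.Pair = type { i64, i64 }

        define void @make_pair(%struct.Pair* sret(%struct.Pair) align 8 %out, i64 %a, i64 %b) {
          ; The caller guarantees %out is valid, so these stores cannot trap.
          %first = getelementptr inbounds %struct.Pair, %struct.Pair* %out, i32 0, i32 0
          store i64 %a, i64* %first, align 8
          %second = getelementptr inbounds %struct.Pair, %struct.Pair* %out, i32 0, i32 1
          store i64 %b, i64* %second, align 8
          ret void
        }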
1191
1192.. _attr_elementtype:
1193
1194``elementtype(<ty>)``
1195
1196    The ``elementtype`` argument attribute can be used to specify a pointer
    element type in a way that is compatible with `opaque pointers
    <OpaquePointers.html>`__.
1199
1200    The ``elementtype`` attribute by itself does not carry any specific
1201    semantics. However, certain intrinsics may require this attribute to be
1202    present and assign it particular semantics. This will be documented on
1203    individual intrinsics.
1204
1205    The attribute may only be applied to pointer typed arguments of intrinsic
1206    calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1207    to parameters on function declarations. For non-opaque pointers, the type
1208    passed to ``elementtype`` must match the pointer element type.
1209
1210.. _attr_align:
1211
1212``align <n>`` or ``align(<n>)``
1213    This indicates that the pointer value has the specified alignment.
1214    If the pointer value does not have the specified alignment,
    a :ref:`poison value <poisonvalues>` is returned or passed instead. The
1216    ``align`` attribute should be combined with the ``noundef`` attribute to
1217    ensure a pointer is aligned, or otherwise the behavior is undefined. Note
1218    that ``align 1`` has no effect on non-byval, non-preallocated arguments.
1219
1220    Note that this attribute has additional semantics when combined with the
1221    ``byval`` or ``preallocated`` attribute, which are documented there.
1222
1223.. _noalias:
1224
1225``noalias``
1226    This indicates that memory locations accessed via pointer values
1227    :ref:`based <pointeraliasing>` on the argument or return value are not also
1228    accessed, during the execution of the function, via pointer values not
1229    *based* on the argument or return value. This guarantee only holds for
1230    memory locations that are *modified*, by any means, during the execution of
1231    the function. The attribute on a return value also has additional semantics
1232    described below. The caller shares the responsibility with the callee for
1233    ensuring that these requirements are met.  For further details, please see
1234    the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
1235    or No>`.
1236
1237    Note that this definition of ``noalias`` is intentionally similar
1238    to the definition of ``restrict`` in C99 for function arguments.
1239
1240    For function return values, C99's ``restrict`` is not meaningful,
1241    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1242    attribute on return values are stronger than the semantics of the attribute
1243    when used on function arguments. On function return values, the ``noalias``
1244    attribute indicates that the function acts like a system memory allocation
1245    function, returning a pointer to allocated storage disjoint from the
1246    storage for any other object accessible to the caller.
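
    Both uses can be sketched with hypothetical functions: an allocator-like
    function with ``noalias`` on its return value, and a function whose two
    pointer arguments are promised not to alias for modified memory:

    .. code-block:: llvm

        ; Behaves like an allocation function: the result is disjoint from
        ; any other object accessible to the caller.
        declare noalias i8* @my_alloc(i64)

        define void @copy_one(i32* noalias %dst, i32* noalias %src) {
          ; The store through %dst is not also accessed through %src.
          %v = load i32, i32* %src
          store i32 %v, i32* %dst
          ret void
        }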
1247
1248.. _nocapture:
1249
1250``nocapture``
1251    This indicates that the callee does not :ref:`capture <pointercapture>` the
1252    pointer. This is not a valid attribute for return values.
1253    This attribute applies only to the particular copy of the pointer passed in
    this argument. A caller could pass two copies of the same pointer with one
    being annotated ``nocapture`` and the other not, and the callee could validly
    capture through the non-annotated parameter.
1257
1258.. code-block:: llvm
1259
    define void @f(i8* nocapture %a, i8* %b) {
      ; (capture %b)
      ret void
    }
1263
1264    call void @f(i8* @glb, i8* @glb) ; well-defined
1265
1266``nofree``
    This indicates that the callee does not free the pointer argument. This is not
1268    a valid attribute for return values.
1269
1270.. _nest:
1271
1272``nest``
1273    This indicates that the pointer parameter can be excised using the
1274    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1275    attribute for return values and can only be applied to one parameter.
1276
1277``returned``
1278    This indicates that the function always returns the argument as its return
1279    value. This is a hint to the optimizer and code generator used when
1280    generating the caller, allowing value propagation, tail call optimization,
1281    and omission of register saves and restores in some cases; it is not
1282    checked or enforced when generating the callee. The parameter and the
1283    function return type must be valid operands for the
1284    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1285    return values and can only be applied to one parameter.
1286
1287``nonnull``
1288    This indicates that the parameter or return pointer is not null. This
1289    attribute may only be applied to pointer typed parameters. This is not
1290    checked or enforced by LLVM; if the parameter or return pointer is null,
    a :ref:`poison value <poisonvalues>` is returned or passed instead.
1292    The ``nonnull`` attribute should be combined with the ``noundef`` attribute
1293    to ensure a pointer is not null or otherwise the behavior is undefined.
1294
``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array); however, ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space), except if the
    ``null_pointer_is_valid`` function attribute is present.
    ``n`` should be a positive number. The pointer should be well defined;
    otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
    implies ``noundef``.

``dereferenceable_or_null(<n>)``
    This indicates that the parameter or return value isn't both
    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
    time. All non-null pointers tagged with
    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
    and in other address spaces ``dereferenceable_or_null(<n>)``
    implies that a pointer is at least one of ``dereferenceable(<n>)``
    or ``null`` (i.e. it may be both ``null`` and
    ``dereferenceable(<n>)``). This attribute may only be applied to
    pointer typed parameters.

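
    The ``nonnull``, ``dereferenceable`` and ``dereferenceable_or_null``
    attributes can be compared side by side. The sketch below uses hypothetical
    function names and the typed-pointer syntax found elsewhere in this
    document; it is an illustration rather than an exhaustive account of the
    semantics:

    .. code-block:: llvm

        ; A null or undefined argument is undefined behavior at the call.
        declare void @use(i32* nonnull noundef %p)

        ; At least 4 bytes are known to be dereferenceable at %p.
        define i32 @load_first(i32* dereferenceable(4) %p) {
          %v = load i32, i32* %p
          ret i32 %v
        }

        ; %p is either null or points to at least 8 dereferenceable bytes.
        declare void @maybe_null(i8* dereferenceable_or_null(8) %p)
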
``swiftself``
    This indicates that the parameter is the self/context parameter. This is not
    a valid attribute for return values and can only be applied to one
    parameter.

``swiftasync``
    This indicates that the parameter is the asynchronous context parameter and
    triggers the creation of a target-specific extended frame record to store
    this pointer. This is not a valid attribute for return values and can only
    be applied to one parameter.

``swifterror``
    This attribute is intended to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from and stored to, or used
    as a ``swifterror`` argument. This is not a valid attribute for return values
    and can only be applied to one parameter.

    These constraints allow the calling convention to optimize access to
    ``swifterror`` variables by associating them with a specific register at
    call boundaries rather than placing them in memory. Since this does change
    the calling convention, a function which uses the ``swifterror`` attribute
    on a parameter is not ABI-compatible with one which does not.

    These constraints also allow LLVM to assume that a ``swifterror`` argument
    does not alias any other memory visible within a function and that a
    ``swifterror`` alloca passed as an argument does not escape.

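
    The call-site rule for ``swifterror`` can be sketched as follows. The type
    ``%swift.error`` and the functions ``@may_fail`` and ``@caller`` are
    hypothetical names chosen for illustration, and typed-pointer syntax is
    assumed:

    .. code-block:: llvm

        ; Hypothetical opaque error type.
        %swift.error = type opaque

        ; Hypothetical callee with a swifterror parameter.
        declare swiftcc void @may_fail(%swift.error** swifterror)

        define swiftcc void @caller() {
        entry:
          ; The argument must come from a swifterror alloca (or from the
          ; caller's own swifterror parameter).
          %err = alloca swifterror %swift.error*
          store %swift.error* null, %swift.error** %err
          call swiftcc void @may_fail(%swift.error** swifterror %err)
          ; Apart from being passed as a swifterror argument, the value is
          ; only loaded from and stored to.
          %e = load %swift.error*, %swift.error** %err
          ret void
        }
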
``immarg``
    This indicates the parameter is required to be an immediate
    value. This must be a trivial immediate integer or floating-point
    constant. Undef or constant expressions are not valid. This is
    only valid on intrinsic declarations and cannot be applied to a
    call site or arbitrary function.

``noundef``
    This attribute applies to parameters and return values. If the value
    representation contains any undefined or poison bits, the behavior is
    undefined. Note that this does not refer to padding introduced by the
    type's storage representation.

``alignstack(<n>)``
    This indicates the alignment that should be considered by the backend when
    assigning this parameter to a stack slot during calling convention
    lowering. The enforcement of the specified alignment is target-dependent,
    as target-specific calling convention rules may override this value. This
    attribute serves the purpose of carrying language specific alignment
    information that is not mapped to base types in the backend (for example,
    over-alignment specification through language attributes).

.. _gc:

Garbage Collector Strategy Names
--------------------------------

Each function may specify a garbage collector strategy name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.

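
As a concrete sketch, assuming the ``shadow-stack`` strategy is among the
built-in strategies available in the LLVM build at hand, a complete function
using it might look like this (the function name is hypothetical):

.. code-block:: llvm

    ; Hypothetical function compiled with the shadow-stack GC strategy.
    define void @collected() gc "shadow-stack" {
      ret void
    }
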
.. _prefixdata:

Prefix Data
-----------

Prefix data is data associated with a function which the code
generator will emit immediately before the function's entrypoint.
The purpose of this feature is to allow frontends to associate
language-specific runtime metadata with specific functions and make it
available through the function pointer while still allowing the
function pointer to be called.

To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1. This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``,

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as,

.. code-block:: llvm

    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a

Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned. If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.

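
One way to add such padding is to wrap the payload in a packed struct whose
size is a multiple of the desired alignment. The sketch below assumes a
desired alignment of 16 bytes; the type name ``%prefix_ty`` and the 12 bytes
of padding are illustrative choices, not a required layout:

.. code-block:: llvm

    ; 4 bytes of payload plus 12 bytes of padding = 16 bytes in total.
    %prefix_ty = type <{ i32, [12 x i8] }>

    define void @f() align 16 prefix %prefix_ty <{ i32 123, [12 x i8] zeroinitializer }> {
      ret void
    }

With this layout the ``i32`` payload sits 16 bytes before the entrypoint, so a
reader would cast the function pointer to ``%prefix_ty*``, step to index -1,
and then select field 0.
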
A function may have prefix data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _prologuedata:

Prologue Data
-------------

The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
be inserted prior to the function body. This can be used for enabling
function hot-patching and instrumentation.

To maintain the semantics of ordinary function calls, the prologue data must
have a particular format. Specifically, it must begin with a sequence of
bytes which decode to a sequence of machine instructions, valid for the
module's target, which transfer control to the point immediately succeeding
the prologue data, without performing any other visible action. This allows
the inliner and other passes to reason about the semantics of the function
definition without needing to reason about the prologue data. Obviously this
makes the format of the prologue data highly target dependent.

A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
which encodes the ``nop`` instruction:

.. code-block:: text

    define void @f() prologue i8 144 { ... }

Generally prologue data can be formed by encoding a relative branch instruction
which skips the metadata, as in this example of valid prologue data for the
x86_64 architecture, where the first two bytes encode ``jmp .+10``:

.. code-block:: text

    %0 = type <{ i8, i8, i8* }>

    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }

A function may have prologue data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _personalityfn:

Personality Function
--------------------

The ``personality`` attribute permits functions to specify what function
to use for exception handling.

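
For illustration, the sketch below attaches ``__gxx_personality_v0``, the
personality routine used for C++ on Itanium-ABI targets, to a function. The
function name ``@f`` is hypothetical and typed-pointer syntax is assumed:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)

    ; The personality routine is passed as a constant after the keyword.
    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      ret void
    }
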
.. _attrgrp:

Attribute Groups
----------------

Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.

An attribute group is a module-level object. To use an attribute group, an
object references the attribute group's ID (e.g. ``#37``). An object may refer
to more than one attribute group. In that situation, the attributes from the
different groups are merged.

Here is an example of attribute groups for a function that should always be
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:

.. code-block:: llvm

   ; Target-independent attributes:
   attributes #0 = { alwaysinline alignstack=4 }

   ; Target-dependent attributes:
   attributes #1 = { "no-sse" }

   ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
   define void @f() #0 #1 { ... }

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.

Function attributes are simple keywords that follow the type specified.
If multiple attributes are needed, they are space separated. For
example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }

``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer. If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available. The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
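
    A short sketch of both forms, using the hypothetical allocator names
    ``@my_malloc`` and ``@my_calloc`` and typed-pointer syntax:

    .. code-block:: llvm

        ; One index: parameter 0 gives the number of bytes available.
        declare i8* @my_malloc(i64) allocsize(0)

        ; Two indices: parameter 0 times parameter 1 bytes are available.
        declare i8* @my_calloc(i64, i64) allocsize(0, 1)
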
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
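
    The pairing of ``builtin`` with ``nobuiltin`` can be sketched as follows.
    ``_Znwm`` is the Itanium-mangled name of ``operator new(unsigned long)``;
    the wrapper ``@alloc8`` is a hypothetical name, and typed-pointer syntax
    is assumed:

    .. code-block:: llvm

        ; The declaration opts out of built-in treatment by default.
        declare i8* @_Znwm(i64) nobuiltin

        define i8* @alloc8() {
          ; This particular direct call opts back in.
          %p = call i8* @_Znwm(i64 8) builtin
          ret i8* %p
        }
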
``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold; and, thus, given low
    weight.
``convergent``
    In some parallel execution models, there exist operations that cannot be
    made control-dependent on any additional values.  We call such operations
    ``convergent``, and mark them with this attribute.

    The ``convergent`` attribute may appear on functions or call/invoke
    instructions.  When it appears on a function, it indicates that calls to
    this function should not be made control-dependent on additional values.
    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
    calls to this intrinsic cannot be made control-dependent on additional
    values.

    When it appears on a call/invoke, the ``convergent`` attribute indicates
    that we should treat the call as though we're calling a convergent
    function.  This is particularly useful on indirect calls; without this we
    may treat such calls as though the target is non-convergent.

    The optimizer may remove the ``convergent`` attribute on functions when it
    can prove that the function does not execute any convergent operations.
    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
    can prove that the call/invoke cannot call a convergent function.
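
    Both placements can be sketched as follows; ``@barrier`` and
    ``@call_indirect`` are hypothetical names, and typed-pointer syntax is
    assumed:

    .. code-block:: llvm

        ; On a declaration: every call to @barrier is convergent.
        declare void @barrier() convergent

        define void @call_indirect(void ()* %fp) {
          ; On a call site: treat the unknown target as convergent.
          call void %fp() convergent
          ret void
        }
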
``disable_sanitizer_instrumentation``
    When instrumenting code with sanitizers, it can be important to skip certain
    functions to ensure no instrumentation is applied to them.

    This attribute is not always equivalent to the absence of ``sanitize_<name>``
    attributes: depending on the specific sanitizer, code can be inserted into
    functions regardless of the ``sanitize_<name>`` attribute to prevent false
    positive reports.

    ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
    taking precedence over the ``sanitize_<name>`` attributes and other compiler
    flags.
``"dontcall-error"``
    This attribute denotes that an error diagnostic should be emitted when a
    call of a function with this attribute is not eliminated via optimization.
    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
    such callees to attach information about where in the source language such a
    call came from. A string value can be provided as a note.
``"dontcall-warn"``
    This attribute denotes that a warning diagnostic should be emitted when a
    call of a function with this attribute is not eliminated via optimization.
    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
    such callees to attach information about where in the source language such a
    call came from. A string value can be provided as a note.
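
    A minimal sketch of the two diagnostics with a note attached; the function
    names and the note strings are hypothetical:

    .. code-block:: llvm

        ; Any call that survives optimization should produce a warning.
        declare void @old_api() "dontcall-warn"="old_api is deprecated"

        ; Any call that survives optimization should produce an error.
        declare void @forbidden_api() "dontcall-error"="forbidden_api must not be called"
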
``"frame-pointer"``
    This attribute tells the code generator whether the function
    should keep the frame pointer. The code generator may emit the frame pointer
    even if this attribute says the frame pointer can be eliminated.
    The allowed string values are:

     * ``"none"`` (default) - the frame pointer can be eliminated.
     * ``"non-leaf"`` - the frame pointer should be kept if the function calls
       other functions.
     * ``"all"`` - the frame pointer should be kept.
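
    For example, a function (here with the hypothetical name ``@f``) that
    requests the ``"non-leaf"`` behavior could be written as:

    .. code-block:: llvm

        define void @f() "frame-pointer"="non-leaf" {
          ret void
        }
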
``hot``
    This attribute indicates that this function is a hot spot of the program
    execution. The function will be optimized more aggressively and will be
    placed into a special subsection of the text section to improve locality.

    When profile feedback is enabled, this attribute takes precedence over
    the profile information. By marking a function ``hot``, users can work
    around the cases where the training input does not have good coverage
    on all the hot functions.
``inaccessiblememonly``
    This attribute indicates that the function may only access memory that
    is not accessible by the module being compiled. This is a weaker form
    of ``readnone``. If the function reads or writes other memory, the
    behavior is undefined.
``inaccessiblemem_or_argmemonly``
    This attribute indicates that the function may only access memory that is
    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``. If the
    function reads or writes other memory, the behavior is undefined.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer. Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
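
    A minimal sketch of the required pairing, using the hypothetical function
    name ``@handler``:

    .. code-block:: llvm

        ; unnamed_addr states that the address itself is not significant,
        ; which is what allows the jump-table indirection.
        define void @handler() unnamed_addr jumptable {
          ret void
        }
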
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
``"no-inline-line-tables"``
    When this attribute is set to true, the inliner discards source locations
    when inlining code and instead uses the source location of the call site.
    Breakpoints set on code that was inlined into the current function will
    not fire during the execution of the inlined call sites. If the debugger
    stops inside an inlined call site, it will appear to be stopped at the
    outermost inlined call site.
``"no-jump-tables"``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``nofree``
    This function attribute indicates that the function does not, directly or
    transitively, call a memory-deallocation function (``free``, for example)
    on a memory allocation which existed before the call.

    As a result, uncaptured pointers that are known to be dereferenceable
    prior to a call to a function with the ``nofree`` attribute are still
    known to be dereferenceable after the call. The capturing condition is
    necessary in environments where the function might communicate the
    pointer to another thread which then deallocates the memory.  Alternatively,
    ``nosync`` would ensure such communication cannot happen and even captured
    pointers cannot be freed by the function.

    A ``nofree`` function is explicitly allowed to free memory which it
    allocated or (if not ``nosync``) arrange for another thread to free
    memory on its behalf.  As a result, perhaps surprisingly, a ``nofree``
    function can return a pointer to a previously deallocated memory object.
``noimplicitfloat``
    Disallows implicit floating-point code. This inhibits optimizations that
    use floating-point code and floating-point/SIMD/vector registers for
    operations that are not nominally floating-point. LLVM instructions that
    perform floating-point operations or require access to floating-point
    registers may still cause floating-point code to be generated.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nomerge``
    This attribute indicates that calls to this function should never be merged
    during optimization. For example, it will prevent tail merging otherwise
    identical code sequences that raise an exception or terminate the program.
    Tail merging normally reduces the precision of source location information,
    making stack traces less useful for debugging. This attribute gives the
    user control over the tradeoff between code size and debug information
    precision.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``noprofile``
    This function attribute prevents instrumentation based profiling, used for
    coverage or profile based optimization, from being added to a function,
    even when inlined.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``"indirect-tls-seg-refs"``
    This attribute indicates that the code generator should not use
    direct TLS access through segment registers, even if the
    target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally, that is, through a return instruction. This produces undefined
    behavior at runtime if the function ever does dynamically return. Annotated
    functions may still raise an exception, i.e., ``nounwind`` is not implied.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.
``willreturn``
    This function attribute indicates that a call of this function will
    either exhibit undefined behavior or come back and continue execution
    at a point in the existing call stack that includes the current invocation.
    Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
    If an invocation of an annotated function does not return control back
    to a point in the call stack, the behavior is undefined.
``nosync``
    This function attribute indicates that the function does not communicate
    (synchronize) with another thread through memory or other well-defined means.
    Synchronization is considered possible in the presence of ``atomic`` accesses
    that enforce an order (that is, accesses other than "unordered" and
    "monotonic"), ``volatile`` accesses, and ``convergent`` function calls.
    Note that ``convergent`` function calls allow non-memory communication,
    e.g., cross-lane operations, which is also considered synchronization.
    However, ``convergent`` does not contradict ``nosync``.
    If an annotated function does ever synchronize with another thread,
    the behavior is undefined.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked nounwind may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation defined semantics.
``nosanitize_coverage``
    This attribute indicates that SanitizerCoverage instrumentation is disabled
    for this function.
``null_pointer_is_valid``
    If ``null_pointer_is_valid`` is set, then the ``null`` address
    in address-space 0 is considered to be a valid address for memory loads and
    stores. Any analysis or optimization should not treat dereferencing a
    pointer to ``null`` as undefined behavior in this function.
    Note: Comparing address of a global variable to ``null`` may still
    evaluate to false because of a limitation in querying this attribute inside
    constant expressions.
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
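
    Because ``optnone`` requires ``noinline``, the two appear together on a
    definition. A minimal sketch with the hypothetical function name
    ``@debug_me``:

    .. code-block:: llvm

        ; optnone must be accompanied by noinline.
        define void @debug_me() noinline optnone {
          ret void
        }
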
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

     * ``"prologue-short-redirect"`` - This style of patchable
       function is intended to support patching a function prologue to
       redirect control away from the function in a thread safe
       manner.  It guarantees that the first instruction of the
       function will be large enough to accommodate a short jump
       instruction, and will be sufficiently aligned to allow being
       fully changed via an atomic compare-and-swap instruction.
       While the first requirement can be satisfied by inserting a large
       enough NOP, LLVM can and will try to re-purpose an existing
       instruction (i.e. one that would have to be emitted anyway) as
       the patchable instruction larger than a short jump.

       ``"prologue-short-redirect"`` is currently only supported on
       x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations.  All of the semantic effects the
    patching may have must be separately conveyed via the linkage type.
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that accesses to the stack must be
    no further apart than the size of the guard region to a previous
    access of the stack. It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``readnone``
    On a function, this attribute indicates that the function computes its
    result (or decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc) visible to
    caller functions. It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means while it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods (since they write to memory), there may
    be non-``C++`` mechanisms that throw exceptions without writing to LLVM
    visible memory.

    On an argument, this attribute indicates that the function does not
    dereference that pointer argument, even though it may read or write the
    memory that the pointer points to if accessed through other pointers.

    If a readnone function reads or writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads from
    or writes to a readnone pointer argument, the behavior is undefined.
``readonly``
    On a function, this attribute indicates that the function does not write
    through any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc) visible to
    caller functions. It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state.  This means while it
    cannot unwind exceptions by calling the ``C++`` exception throwing methods
    (since they write to memory), there may be non-``C++`` mechanisms that throw
    exceptions without writing to LLVM visible memory.

    On an argument, this attribute indicates that the function does not write
    through this pointer argument, even though it may write to the memory that
    the pointer points to.

    If a readonly function writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function writes to
    a readonly pointer argument, the behavior is undefined.
1876``"stack-probe-size"``
1877    This attribute controls the behavior of stack probes: either
1878    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
1879    It defines the size of the guard region. It ensures that if the function
1880    may use more stack space than the size of the guard region, stack probing
1881    sequence will be emitted. It takes one required integer value, which
1882    is 4096 by default.
1883
1884    If a function that has a ``"stack-probe-size"`` attribute is inlined into
1885    a function with another ``"stack-probe-size"`` attribute, the resulting
1886    function has the ``"stack-probe-size"`` attribute that has the lower
1887    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
1888    inlined into a function that has no ``"stack-probe-size"`` attribute
1889    at all, the resulting function has the ``"stack-probe-size"`` attribute
1890    of the callee.
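
    The inlining rule can be sketched with two hypothetical functions; the
    names and sizes are assumptions chosen only for illustration:

    .. code-block:: llvm

        ; Callee uses a guard region size of 4096.
        define void @callee() "stack-probe-size"="4096" {
          %buf = alloca [8192 x i8]
          ret void
        }

        ; Caller uses a larger guard region size of 8192.
        define void @caller() "stack-probe-size"="8192" {
          call void @callee()
          ret void
        }

    If ``@callee`` is inlined into ``@caller``, the rule above gives the
    resulting function ``"stack-probe-size"="4096"``, the lower of the two
    values.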
1891``"no-stack-arg-probe"``
1892    This attribute disables ABI-required stack probes, if any.
1893``writeonly``
1894    On a function, this attribute indicates that the function may write to but
1895    does not read from memory.
1896
1897    On an argument, this attribute indicates that the function may write to but
1898    does not read through this pointer argument (even though it may read from
1899    the memory that the pointer points to).
1900
1901    If a writeonly function reads memory visible to the program, or
1902    has other side-effects, the behavior is undefined. If a function reads
1903    from a writeonly pointer argument, the behavior is undefined.
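
    A minimal sketch, assuming a hypothetical function ``@store_zero`` that
    only stores through its argument:

    .. code-block:: llvm

        ; Writes through %p but never loads from memory.
        define void @store_zero(i32* writeonly %p) writeonly {
          store i32 0, i32* %p
          ret void
        }

    A ``load`` through ``%p`` in this body would make the behavior undefined
    under the rules above.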
1904``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed arguments,
    with arbitrary offsets. In other words, all memory operations in the
    function can refer to memory only using pointers based on its function
    arguments.

    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute in order to specify that the function reads only from its
    arguments.
1913
1914    If an argmemonly function reads or writes memory other than the pointer
1915    arguments, or has other side-effects, the behavior is undefined.
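
    As a sketch of the combination with ``readonly``, the hypothetical function
    below reads two elements through pointers based on its argument and touches
    no other memory:

    .. code-block:: llvm

        define i32 @sum2(i32* %p) argmemonly readonly {
          ; %q is based on the argument %p, at an offset of one element.
          %q = getelementptr inbounds i32, i32* %p, i64 1
          %a = load i32, i32* %p
          %b = load i32, i32* %q
          %s = add i32 %a, %b
          ret i32 %s
        }

    Loading from a global variable inside ``@sum2`` would violate
    ``argmemonly``, since that access is not based on a pointer argument.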
1916``returns_twice``
1917    This attribute indicates that this function can return twice. The C
1918    ``setjmp`` is an example of such a function. The compiler disables
1919    some optimizations (like tail calls) in the caller of these
1920    functions.
1921``safestack``
1922    This attribute indicates that
1923    `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
1924    protection is enabled for this function.
1925
1926    If a function that has a ``safestack`` attribute is inlined into a
1927    function that doesn't have a ``safestack`` attribute or which has an
1928    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
1929    function will have a ``safestack`` attribute.
1930``sanitize_address``
1931    This attribute indicates that AddressSanitizer checks
1932    (dynamic address safety analysis) are enabled for this function.
1933``sanitize_memory``
1934    This attribute indicates that MemorySanitizer checks (dynamic detection
1935    of accesses to uninitialized memory) are enabled for this function.
1936``sanitize_thread``
1937    This attribute indicates that ThreadSanitizer checks
1938    (dynamic thread safety analysis) are enabled for this function.
1939``sanitize_hwaddress``
1940    This attribute indicates that HWAddressSanitizer checks
1941    (dynamic address safety analysis based on tagged pointers) are enabled for
1942    this function.
1943``sanitize_memtag``
1944    This attribute indicates that MemTagSanitizer checks
1945    (dynamic address safety analysis based on Armv8 MTE) are enabled for
1946    this function.
1947``speculative_load_hardening``
1948    This attribute indicates that
1949    `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
1950    should be enabled for the function body.
1951
1952    Speculative Load Hardening is a best-effort mitigation against
1953    information leak attacks that make use of control flow
    misspeculation, specifically misspeculation of whether a branch
    is taken or not. Typically vulnerabilities enabling such attacks are
    classified as "Spectre variant #1". Notably, this does not attempt to
    mitigate against misspeculation of branch target, classified as
1958    "Spectre variant #2" vulnerabilities.
1959
1960    When inlining, the attribute is sticky. Inlining a function that carries
1961    this attribute will cause the caller to gain the attribute. This is intended
1962    to provide a maximally conservative model where the code in a function
1963    annotated with this attribute will always (even after inlining) end up
1964    hardened.
1965``speculatable``
1966    This function attribute indicates that the function does not have any
1967    effects besides calculating its result and does not have undefined behavior.
1968    Note that ``speculatable`` is not enough to conclude that along any
1969    particular execution path the number of calls to this function will not be
1970    externally observable. This attribute is only valid on functions
1971    and declarations, not on individual call sites. If a function is
1972    incorrectly marked as speculatable and really does exhibit
1973    undefined behavior, the undefined behavior may be observed even
1974    if the call site is dead code.
1975
1976``ssp``
1977    This attribute indicates that the function should emit a stack
1978    smashing protector. It is in the form of a "canary" --- a random value
1979    placed on the stack before the local variables that's checked upon
1980    return from the function to see if it has been overwritten. A
1981    heuristic is used to determine if a function needs stack protectors
1982    or not. The heuristic used will enable protectors for functions with:
1983
1984    - Character arrays larger than ``ssp-buffer-size`` (default 8).
1985    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
1986    - Calls to alloca() with variable sizes or constant sizes greater than
1987      ``ssp-buffer-size``.
1988
1989    Variables that are identified as requiring a protector will be arranged
1990    on the stack such that they are adjacent to the stack protector guard.
1991
1992    A function with the ``ssp`` attribute but without the ``alwaysinline``
1993    attribute cannot be inlined into a function without a
1994    ``ssp/sspreq/sspstrong`` attribute. If inlined, the caller will get the
1995    ``ssp`` attribute. ``call``, ``invoke``, and ``callbr`` instructions with
1996    the ``alwaysinline`` attribute force inlining.
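
    As an illustration, the hypothetical function below contains a 16-byte
    character array, which exceeds the default ``ssp-buffer-size`` of 8, so the
    heuristic described above would enable a protector for it:

    .. code-block:: llvm

        declare i8* @strcpy(i8*, i8*)

        define void @copy_name(i8* %src) ssp {
          ; Character array larger than the default buffer size threshold.
          %buf = alloca [16 x i8]
          %dst = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 0
          %r = call i8* @strcpy(i8* %dst, i8* %src)
          ret void
        }

    With a 4-byte array instead, the ``ssp`` heuristic would not select this
    function for protection.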
1997``sspstrong``
1998    This attribute indicates that the function should emit a stack smashing
1999    protector. This attribute causes a strong heuristic to be used when
2000    determining if a function needs stack protectors. The strong heuristic
2001    will enable protectors for functions with:
2002
    - Arrays of any size and type.
2004    - Aggregates containing an array of any size and type.
2005    - Calls to alloca().
2006    - Local variables that have had their address taken.
2007
2008    Variables that are identified as requiring a protector will be arranged
2009    on the stack such that they are adjacent to the stack protector guard.
2010    The specific layout rules are:
2011
2012    #. Large arrays and structures containing large arrays
2013       (``>= ssp-buffer-size``) are closest to the stack protector.
2014    #. Small arrays and structures containing small arrays
2015       (``< ssp-buffer-size``) are 2nd closest to the protector.
2016    #. Variables that have had their address taken are 3rd closest to the
2017       protector.
2018
2019    This overrides the ``ssp`` function attribute.
2020
2021    A function with the ``sspstrong`` attribute but without the
2022    ``alwaysinline`` attribute cannot be inlined into a function without a
2023    ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
2024    ``sspstrong`` attribute unless the ``sspreq`` attribute exists.  ``call``,
2025    ``invoke``, and ``callbr`` instructions with the ``alwaysinline`` attribute
2026    force inlining.
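
    A sketch of a case that only the strong heuristic covers, using
    hypothetical names: the function has no arrays, but a local variable has
    its address taken.

    .. code-block:: llvm

        declare void @use(i32*)

        define void @escape_local() sspstrong {
          ; The address of %x is passed to another function.
          %x = alloca i32
          call void @use(i32* %x)
          ret void
        }

    Under ``ssp`` alone, none of the listed conditions would apply to this
    function.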
2027``sspreq``
2028    This attribute indicates that the function should *always* emit a stack
2029    smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2030    attributes.
2031
2032    Variables that are identified as requiring a protector will be arranged
2033    on the stack such that they are adjacent to the stack protector guard.
2034    The specific layout rules are:
2035
2036    #. Large arrays and structures containing large arrays
2037       (``>= ssp-buffer-size``) are closest to the stack protector.
2038    #. Small arrays and structures containing small arrays
2039       (``< ssp-buffer-size``) are 2nd closest to the protector.
2040    #. Variables that have had their address taken are 3rd closest to the
2041       protector.
2042
2043    A function with the ``sspreq`` attribute but without the ``alwaysinline``
2044    attribute cannot be inlined into a function without a
2045    ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
2046    ``sspreq`` attribute.  ``call``, ``invoke``, and ``callbr`` instructions
2047    with the ``alwaysinline`` attribute force inlining.
2048
2049``strictfp``
2050    This attribute indicates that the function was called from a scope that
2051    requires strict floating-point semantics.  LLVM will not attempt any
2052    optimizations that require assumptions about the floating-point rounding
2053    mode or that might alter the state of floating-point status flags that
2054    might otherwise be set or cleared by calling this function. LLVM will
2055    not introduce any new floating-point instructions that may trap.
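
    A minimal sketch of a ``strictfp`` function. It assumes the constrained
    floating-point intrinsics described elsewhere in this manual, and the
    function name is hypothetical:

    .. code-block:: llvm

        declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

        define double @add_strict(double %a, double %b) strictfp {
          ; Rounding mode is read dynamically; exceptions are observed strictly.
          %sum = call double @llvm.experimental.constrained.fadd.f64(
                     double %a, double %b,
                     metadata !"round.dynamic",
                     metadata !"fpexcept.strict") strictfp
          ret double %sum
        }

    The call site carries ``strictfp`` as well, so the same restrictions apply
    to the call itself.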
2056
2057``"denormal-fp-math"``
2058    This indicates the denormal (subnormal) handling that may be
2059    assumed for the default floating-point environment. This is a
2060    comma separated pair. The elements may be one of ``"ieee"``,
2061    ``"preserve-sign"``, or ``"positive-zero"``. The first entry
2062    indicates the flushing mode for the result of floating point
2063    operations. The second indicates the handling of denormal inputs
2064    to floating point instructions. For compatibility with older
2065    bitcode, if the second value is omitted, both input and output
2066    modes will assume the same mode.
2067
    If this attribute is not specified, the default is
2069    ``"ieee,ieee"``.
2070
2071    If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2072    denormal outputs may be flushed to zero by standard floating-point
2073    operations. It is not mandated that flushing to zero occurs, but if
2074    a denormal output is flushed to zero, it must respect the sign
2075    mode. Not all targets support all modes. While this indicates the
2076    expected floating point mode the function will be executed with,
2077    this does not make any attempt to ensure the mode is
2078    consistent. User or platform code is expected to set the floating
2079    point mode appropriately before function entry.
2080
    If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a
    floating-point operation must treat any input denormal value as
    zero. In some situations, if an instruction does not respect this
    mode, the input may need to be converted to 0 as if by
    ``@llvm.canonicalize`` during lowering for correctness.
2086
2087``"denormal-fp-math-f32"``
2088    Same as ``"denormal-fp-math"``, but only controls the behavior of
    the 32-bit float type (or vectors of 32-bit floats). If both are
    present, this overrides ``"denormal-fp-math"``. Not all targets
2091    support separately setting the denormal mode per type, and no
2092    attempt is made to diagnose unsupported uses. Currently this
2093    attribute is respected by the AMDGPU and NVPTX backends.
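
    A sketch combining both attributes on a hypothetical function. The general
    mode stays ``"ieee,ieee"``, while the 32-bit float override allows denormal
    results to be flushed and requires denormal inputs to be treated as zero,
    with the sign preserved:

    .. code-block:: llvm

        define float @scale(float %x) #0 {
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }

        ; First entry: output mode. Second entry: input mode.
        attributes #0 = { "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="preserve-sign,preserve-sign" }

    Whether the override has any effect depends on the target, as noted above.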
2094
2095``"thunk"``
2096    This attribute indicates that the function will delegate to some other
2097    function with a tail call. The prototype of a thunk should not be used for
2098    optimization purposes. The caller is expected to cast the thunk prototype to
2099    match the thunk target prototype.
2100``uwtable``
2101    This attribute indicates that the ABI being targeted requires that
2102    an unwind table entry be produced for this function even if we can
    show that no exceptions pass through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
2105    units.
2106``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity, allowing fine-grained control over the hardware control-flow
    protection mechanism. The flag is target independent and currently
    applies to a function or function pointer.
2112``shadowcallstack``
2113    This attribute indicates that the ShadowCallStack checks are enabled for
2114    the function. The instrumentation checks that the return address for the
2115    function has not changed between the function prolog and epilog. It is
2116    currently x86_64-specific.
2117``mustprogress``
2118    This attribute indicates that the function is required to return, unwind,
2119    or interact with the environment in an observable way e.g. via a volatile
2120    memory access, I/O, or other synchronization.  The ``mustprogress``
2121    attribute is intended to model the requirements of the first section of
2122    [intro.progress] of the C++ Standard. As a consequence, a loop in a
    function with the ``mustprogress`` attribute can be assumed to terminate if
    it does not interact with the environment in an observable way, and
    terminating loops without side-effects can be removed. If a ``mustprogress``
    function does not satisfy this contract, the behavior is undefined.  This
    attribute does not apply transitively to callees, but does apply to call
    sites within the function. Note that ``willreturn`` implies ``mustprogress``.
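
    As a sketch, the hypothetical function below contains a loop with no
    observable effects that never terminates when ``%n`` is odd:

    .. code-block:: llvm

        define void @spin(i32 %n) mustprogress {
        entry:
          br label %loop
        loop:
          ; Counts up by 2 until the counter equals %n.
          %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
          %i.next = add i32 %i, 2
          %done = icmp eq i32 %i.next, %n
          br i1 %done, label %exit, label %loop
        exit:
          ret void
        }

    Because of ``mustprogress``, the loop can be assumed to terminate, so an
    optimizer may remove it and reduce the body to ``ret void``.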
2129``"warn-stack-size"="<threshold>"``
2130    This attribute sets a threshold to emit diagnostics once the frame size is
2131    known should the frame size exceed the specified value.  It takes one
2132    required integer value, which should be a non-negative integer, and less
    than ``UINT_MAX``.  It's unspecified which threshold will be used when
2134    duplicate definitions are linked together with differing values.
2135``vscale_range(<min>[, <max>])``
2136    This attribute indicates the minimum and maximum vscale value for the given
2137    function. A value of 0 means unbounded. If the optional max value is omitted
2138    then max is set to the value of min. If the attribute is not present, no
2139    assumptions are made about the range of vscale.
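
    A short sketch of both forms, with hypothetical function names:

    .. code-block:: llvm

        ; vscale is assumed to lie between 1 and 16.
        define void @bounded() vscale_range(1,16) {
          ret void
        }

        ; Maximum omitted: max takes the value of min, so vscale is assumed
        ; to be 2.
        define void @fixed() vscale_range(2) {
          ret void
        }

    The bounds here are arbitrary and chosen only to show the syntax.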
2140
2141Call Site Attributes
2142----------------------
2143
In addition to function attributes, the following call-site-only
attributes are supported:
2146
2147``vector-function-abi-variant``
2148    This attribute can be attached to a :ref:`call <i_call>` to list
    the vector functions associated with the function. Notice that the
    attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
    :ref:`callbr <i_callbr>` instruction. The attribute consists of a
    comma separated list of mangled names. The order of the list does
    not imply preference (it is logically a set). The compiler is free
    to pick any listed vector function of its choosing.

    The syntax for the mangled names is as follows::
2157
2158        _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2159
2160    When present, the attribute informs the compiler that the function
2161    ``<scalar_name>`` has a corresponding vector variant that can be
2162    used to perform the concurrent invocation of ``<scalar_name>`` on
2163    vectors. The shape of the vector function is described by the
2164    tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2165    token. The standard name of the vector function is
2166    ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2167    the optional token ``(<vector_redirection>)`` informs the compiler
2168    that a custom name is provided in addition to the standard one
2169    (custom names can be provided for example via the use of ``declare
2170    variant`` in OpenMP 5.0). The declaration of the variant must be
2171    present in the IR Module. The signature of the vector variant is
2172    determined by the rules of the Vector Function ABI (VFABI)
2173    specifications of the target. For Arm and X86, the VFABI can be
2174    found at https://github.com/ARM-software/abi-aa and
2175    https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2176    respectively.
2177
2178    For X86 and Arm targets, the values of the tokens in the standard
2179    name are those that are defined in the VFABI. LLVM has an internal
2180    ``<isa>`` token that can be used to create scalar-to-vector
    mappings for functions that are not directly associated with any of
    the target ISAs (for example, some of the mappings stored in the
    TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2184
2185        <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
2186              | n | s          -> Armv8 Advanced SIMD, SVE
2187              | __LLVM__       -> Internal LLVM Vector ISA
2188
2189    For all targets currently supported (x86, Arm and Internal LLVM),
    the remaining tokens can have the following values::
2191
2192        <mask>:= M | N         -> mask | no mask
2193
2194        <vlen>:= number        -> number of lanes
2195               | x             -> VLA (Vector Length Agnostic)
2196
2197        <parameters>:= v              -> vector
2198                     | l | l <number> -> linear
2199                     | R | R <number> -> linear with ref modifier
2200                     | L | L <number> -> linear with val modifier
2201                     | U | U <number> -> linear with uval modifier
2202                     | ls <pos>       -> runtime linear
2203                     | Rs <pos>       -> runtime linear with ref modifier
2204                     | Ls <pos>       -> runtime linear with val modifier
2205                     | Us <pos>       -> runtime linear with uval modifier
2206                     | u              -> uniform
2207
2208        <scalar_name>:= name of the scalar function
2209
2210        <vector_redirection>:= optional, custom name of the vector function
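
    Putting the grammar together, the sketch below attaches one mapping to a
    call to ``@sin``. The name ``_ZGVbN2v_sin`` decodes as ``b`` (X86 SSE),
    ``N`` (no mask), ``2`` (two lanes) and ``v`` (one vector parameter). The
    wrapper function ``@call_sin`` is a hypothetical name used for illustration:

    .. code-block:: llvm

        declare double @sin(double)

        ; The declaration of the variant must be present in the module.
        declare <2 x double> @_ZGVbN2v_sin(<2 x double>)

        define double @call_sin(double %x) {
          %r = call double @sin(double %x) #0
          ret double %r
        }

        attributes #0 = { "vector-function-abi-variant"="_ZGVbN2v_sin" }

    No ``(<vector_redirection>)`` is given here, so only the standard name is
    used.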
2211
2212``preallocated(<ty>)``
2213    This attribute is required on calls to ``llvm.call.preallocated.arg``
2214    and cannot be used on any other call. See
2215    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2216    details.
2217
2218.. _glattrs:
2219
2220Global Attributes
2221-----------------
2222
2223Attributes may be set to communicate additional information about a global variable.
2224Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2225are grouped into a single :ref:`attribute group <attrgrp>`.
2226
2227.. _opbundles:
2228
2229Operand Bundles
2230---------------
2231
2232Operand bundles are tagged sets of SSA values that can be associated
2233with certain LLVM instructions (currently only ``call`` s and
2234``invoke`` s).  In a way they are like metadata, but dropping them is
2235incorrect and will change program semantics.
2236
2237Syntax::
2238
2239    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2240    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2241    bundle operand ::= SSA value
2242    tag ::= string constant
2243
2244Operand bundles are **not** part of a function's signature, and a
2245given function may be called from multiple places with different kinds
2246of operand bundles.  This reflects the fact that the operand bundles
2247are conceptually a part of the ``call`` (or ``invoke``), not the
2248callee being dispatched to.
2249
2250Operand bundles are a generic mechanism intended to support
2251runtime-introspection-like functionality for managed languages.  While
2252the exact semantics of an operand bundle depend on the bundle tag,
2253there are certain limitations to how much the presence of an operand
2254bundle can influence the semantics of a program.  These restrictions
2255are described as the semantics of an "unknown" operand bundle.  As
2256long as the behavior of an operand bundle is describable within these
2257restrictions, LLVM does not need to have special knowledge of the
2258operand bundle to not miscompile programs containing it.
2259
2260- The bundle operands for an unknown operand bundle escape in unknown
2261  ways before control is transferred to the callee or invokee.
2262- Calls and invokes with operand bundles have unknown read / write
2263  effect on the heap on entry and exit (even if the call target is
2264  ``readnone`` or ``readonly``), unless they're overridden with
2265  callsite specific attributes.
2266- An operand bundle at a call site cannot change the implementation
2267  of the called function.  Inter-procedural optimizations work as
2268  usual as long as they take into account the first two properties.
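
A small sketch of the syntax, using made-up tags that LLVM has no special
knowledge of, so they are treated as "unknown" operand bundles:

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %a, i8* %p) {
      ; Two bundles on one call: the first has two operands, the second none.
      call void @callee(i32 %a) [ "my-tag"(i32 %a, i8* %p), "other"() ]
      ret void
    }

The signature of ``@callee`` is unaffected; the bundles belong to the call
site only.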
2269
2270More specific types of operand bundles are described below.
2271
2272.. _deopt_opbundles:
2273
2274Deoptimization Operand Bundles
2275^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2276
2277Deoptimization operand bundles are characterized by the ``"deopt"``
2278operand bundle tag.  These operand bundles represent an alternate
2279"safe" continuation for the call site they're attached to, and can be
2280used by a suitable runtime to deoptimize the compiled frame at the
2281specified call site.  There can be at most one ``"deopt"`` operand
bundle attached to a call site.  The exact details of deoptimization are
out of scope for the language reference, but it usually involves
2284rewriting a compiled frame into a set of interpreted frames.
2285
2286From the compiler's perspective, deoptimization operand bundles make
2287the call sites they're attached to at least ``readonly``.  They read
2288through all of their pointer typed operands (even if they're not
2289otherwise escaped) and the entire visible heap.  Deoptimization
2290operand bundles do not capture their operands except during
2291deoptimization, in which case control will not be returned to the
2292compiled frame.
2293
2294The inliner knows how to inline through calls that have deoptimization
2295operand bundles.  Just like inlining through a normal call site
2296involves composing the normal and exceptional continuations, inlining
2297through a call site with a deoptimization operand bundle needs to
2298appropriately compose the "safe" deoptimization continuation.  The
2299inliner does this by prepending the parent's deoptimization
2300continuation to every deoptimization continuation in the inlined body.
2301E.g. inlining ``@f`` into ``@g`` in the following example
2302
2303.. code-block:: llvm
2304
2305    define void @f() {
2306      call void @x()  ;; no deopt state
2307      call void @y() [ "deopt"(i32 10) ]
2308      call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
2309      ret void
2310    }
2311
2312    define void @g() {
2313      call void @f() [ "deopt"(i32 20) ]
2314      ret void
2315    }
2316
2317will result in
2318
2319.. code-block:: llvm
2320
2321    define void @g() {
2322      call void @x()  ;; still no deopt state
2323      call void @y() [ "deopt"(i32 20, i32 10) ]
2324      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
2325      ret void
2326    }
2327
2328It is the frontend's responsibility to structure or encode the
2329deoptimization state in a way that syntactically prepending the
2330caller's deoptimization state to the callee's deoptimization state is
2331semantically equivalent to composing the caller's deoptimization
2332continuation after the callee's deoptimization continuation.
2333
2334.. _ob_funclet:
2335
2336Funclet Operand Bundles
2337^^^^^^^^^^^^^^^^^^^^^^^
2338
2339Funclet operand bundles are characterized by the ``"funclet"``
2340operand bundle tag.  These operand bundles indicate that a call site
2341is within a particular funclet.  There can be at most one
2342``"funclet"`` operand bundle attached to a call site and it must have
2343exactly one bundle operand.
2344
2345If any funclet EH pads have been "entered" but not "exited" (per the
2346`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2347it is undefined behavior to execute a ``call`` or ``invoke`` which:
2348
2349* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2350  intrinsic, or
2351* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2352  not-yet-exited funclet EH pad.
2353
2354Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2355executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
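
As a sketch of these rules, the call to the hypothetical ``@cleanup`` function
below executes inside a ``cleanuppad`` funclet and therefore names that pad in
its ``"funclet"`` bundle. The personality function and other names are
assumptions for illustration:

.. code-block:: llvm

    declare void @may_throw()
    declare void @cleanup()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i32 (...)* @__CxxFrameHandler3 {
    entry:
      ; No funclet pad has been entered yet, so no "funclet" bundle here.
      invoke void @may_throw()
              to label %done unwind label %pad
    pad:
      %cp = cleanuppad within none []
      ; %cp is the most-recently-entered, not-yet-exited funclet EH pad.
      call void @cleanup() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller
    done:
      ret void
    }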
2356
2357GC Transition Operand Bundles
2358^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2359
2360GC transition operand bundles are characterized by the
2361``"gc-transition"`` operand bundle tag. These operand bundles mark a
2362call as a transition between a function with one GC strategy to a
2363function with a different GC strategy. If coordinating the transition
2364between GC strategies requires additional code generation at the call
2365site, these bundles may contain any values that are needed by the
2366generated code.  For more details, see :ref:`GC Transitions
2367<gc_transition_args>`.
2368
The bundle contains an arbitrary list of Values which need to be passed
2370to GC transition code. They will be lowered and passed as operands to
2371the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2372that these arguments must be available before and after (but not
2373necessarily during) the execution of the callee.
2374
2375.. _assume_opbundles:
2376
2377Assume Operand Bundles
2378^^^^^^^^^^^^^^^^^^^^^^
2379
Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2381assumptions that a :ref:`parameter attribute <paramattrs>` or a
2382:ref:`function attribute <fnattrs>` holds for a certain value at a certain
2383location. Operand bundles enable assumptions that are either hard or impossible
2384to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2385
2386An assume operand bundle has the form:
2387
2388::
2389
2390      "<tag>"([ <holds for value> [, <attribute argument>] ])
2391
* The tag of the operand bundle is usually the name of the attribute that can
  be assumed to hold. It can also be ``ignore``; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.
2397
2398If there are no arguments the attribute is a property of the call location.
2399
2400If the represented attribute expects a constant argument, the argument provided
2401to the operand bundle should be a constant as well.
2402
2403For example:
2404
2405.. code-block:: llvm
2406
2407      call void @llvm.assume(i1 true) ["align"(i32* %val, i32 8)]
2408
allows the optimizer to assume that at the location of the call to
2410:ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.
2411
2412.. code-block:: llvm
2413
2414      call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(i64* %val)]
2415
2416allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
call location is cold and that ``%val`` is not null.
2418
2419Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2420provided guarantees are violated at runtime the behavior is undefined.
2421
2422Even if the assumed property can be encoded as a boolean value, like
2423``nonnull``, using operand bundles to express the property can still have
2424benefits:
2425
2426* Attributes that can be expressed via operand bundles are directly the
2427  property that the optimizer uses and cares about. Encoding attributes as
2428  operand bundles removes the need for an instruction sequence that represents
  the property (e.g., ``icmp ne i32* %p, null`` for ``nonnull``) and for the
2430  optimizer to deduce the property from that instruction sequence.
2431* Expressing the property using operand bundles makes it easy to identify the
2432  use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
  simplifies and improves heuristics, e.g., for "use-sensitive"
2434  optimizations.
2435
2436.. _ob_preallocated:
2437
2438Preallocated Operand Bundles
2439^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2440
2441Preallocated operand bundles are characterized by the ``"preallocated"``
2442operand bundle tag.  These operand bundles allow separation of the allocation
2443of the call argument memory from the call site.  This is necessary to pass
2444non-trivially copyable objects by value in a way that is compatible with MSVC
2445on some targets.  There can be at most one ``"preallocated"`` operand bundle
2446attached to a call site and it must have exactly one bundle operand, which is
2447a token generated by ``@llvm.call.preallocated.setup``.  A call with this
2448operand bundle should not adjust the stack before entering the function, as
2449that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2450
2451.. code-block:: llvm
2452
2453      %foo = type { i64, i32 }
2454
2455      ...
2456
2457      %t = call token @llvm.call.preallocated.setup(i32 1)
2458      %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
2459      %b = bitcast i8* %a to %foo*
2460      ; initialize %b
2461      call void @bar(i32 42, %foo* preallocated(%foo) %b) ["preallocated"(token %t)]
2462
2463.. _ob_gc_live:
2464
2465GC Live Operand Bundles
2466^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2467
2468A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2469intrinsic. The operand bundle must contain every pointer to a garbage collected
2470object which potentially needs to be updated by the garbage collector.
2471
2472When lowered, any relocated value will be recorded in the corresponding
2473:ref:`stackmap entry <statepoint-stackmap-format>`.  See the intrinsic description
2474for further details.
2475
2476ObjC ARC Attached Call Operand Bundles
2477^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2478
2479A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2480implicitly followed by a marker instruction and a call to an ObjC runtime
2481function that uses the result of the call. The operand bundle takes either the
2482pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2483``@objc_unsafeClaimAutoreleasedReturnValue``) or no arguments. If the bundle
2484doesn't take any arguments, only the marker instruction has to be emitted after
2485the call; the runtime function calls don't have to be emitted since they already
2486have been emitted. The return value of a call with this bundle is used by a call
2487to ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2488void, in which case the operand bundle is ignored.
2489
2490.. code-block:: llvm
2491
2492   ; The marker instruction and a runtime function call are inserted after the call
2493   ; to @foo.
2494   call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_retainAutoreleasedReturnValue) ]
2495   call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_unsafeClaimAutoreleasedReturnValue) ]
2496
2497   ; Only the marker instruction is inserted after the call to @foo.
2498   call i8* @foo() [ "clang.arc.attachedcall"() ]
2499
2500The operand bundle is needed to ensure the call is immediately followed by the
2501marker instruction or the ObjC runtime call in the final output.
2502
2503.. _moduleasm:
2504
2505Module-Level Inline Assembly
2506----------------------------
2507
Modules may contain "module-level inline asm" blocks, which correspond
2509to the GCC "file scope inline asm" blocks. These blocks are internally
2510concatenated by LLVM and treated as a single unit, but may be separated
2511in the ``.ll`` file if desired. The syntax is very simple:
2512
2513.. code-block:: llvm
2514
2515    module asm "inline asm code goes here"
2516    module asm "more can go here"
2517
The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two-digit hexadecimal code for the character.
2521
2522Note that the assembly string *must* be parseable by LLVM's integrated assembler
2523(unless it is disabled), even when emitting a ``.s`` file.
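
As an illustrative sketch of the escaping rule, the block below uses ``\09``
(the hex code for a tab character) inside the strings. The symbol name
``my_marker`` is hypothetical, and the directives are assumed to be accepted by
the target's assembler; the assembly syntax itself is target-specific.

.. code-block:: llvm

    ; "\09" is the two-digit hex escape for a tab character.
    module asm "\09.globl\09my_marker"
    module asm "my_marker:"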
2524
2525.. _langref_datalayout:
2526
2527Data Layout
2528-----------
2529
2530A module may specify a target specific data layout string that specifies
2531how data is to be laid out in memory. The syntax for the data layout is
2532simply:
2533
2534.. code-block:: llvm
2535
2536    target datalayout = "layout specification"
2537
2538The *layout specification* consists of a list of specifications
2539separated by the minus sign character ('-'). Each specification starts
2540with a letter and may include other information after the letter to
2541define some aspect of the data layout. The specifications accepted are
2542as follows:
2543
2544``E``
2545    Specifies that the target lays out data in big-endian form. That is,
2546    the bits with the most significance have the lowest address
2547    location.
2548``e``
2549    Specifies that the target lays out data in little-endian form. That
2550    is, the bits with the least significance have the lowest address
2551    location.
2552``S<size>``
2553    Specifies the natural alignment of the stack in bits. Alignment
2554    promotion of stack variables is limited to the natural stack
2555    alignment to avoid dynamic stack realignment. The stack alignment
2556    must be a multiple of 8-bits. If omitted, the natural stack
2557    alignment defaults to "unspecified", which does not prevent any
2558    alignment promotions.
2559``P<address space>``
2560    Specifies the address space that corresponds to program memory.
2561    Harvard architectures can use this to specify what space LLVM
2562    should place things such as functions into. If omitted, the
2563    program memory space defaults to the default address space of 0,
2564    which corresponds to a Von Neumann architecture that has code
2565    and data in the same space.
2566``G<address space>``
2567    Specifies the address space to be used by default when creating global
2568    variables. If omitted, the globals address space defaults to the default
2569    address space 0.
    Note: variable declarations without an address space are always created in
    address space 0; this property only affects the default value to be used
2572    when creating globals without additional contextual information (e.g. in
2573    LLVM passes).
2574``A<address space>``
2575    Specifies the address space of objects created by '``alloca``'.
2576    Defaults to the default address space of 0.
2577``p[n]:<size>:<abi>:<pref>:<idx>``
2578    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. The fourth parameter,
    ``<idx>``, is the size of the index used for address calculation. If not
    specified, the default index size is equal to the pointer size. All sizes
2582    are in bits. The address space, ``n``, is optional, and if not specified,
2583    denotes the default address space 0. The value of ``n`` must be
2584    in the range [1,2^23).
2585``i<size>:<abi>:<pref>``
2586    This specifies the alignment for an integer type of a given bit
2587    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2588``v<size>:<abi>:<pref>``
2589    This specifies the alignment for a vector type of a given bit
2590    ``<size>``.
2591``f<size>:<abi>:<pref>``
2592    This specifies the alignment for a floating-point type of a given bit
2593    ``<size>``. Only values of ``<size>`` that are supported by the target
2594    will work. 32 (float) and 64 (double) are supported on all targets; 80
2595    or 128 (different flavors of long double) are also supported on some
2596    targets.
2597``a:<abi>:<pref>``
2598    This specifies the alignment for an object of aggregate type.
2599``F<type><abi>``
2600    This specifies the alignment for function pointers.
2601    The options for ``<type>`` are:
2602
2603    * ``i``: The alignment of function pointers is independent of the alignment
2604      of functions, and is a multiple of ``<abi>``.
2605    * ``n``: The alignment of function pointers is a multiple of the explicit
2606      alignment specified on the function, and is a multiple of ``<abi>``.
2607``m:<mangling>``
2608    If present, specifies that llvm names are mangled in the output. Symbols
2609    prefixed with the mangling escape character ``\01`` are passed through
2610    directly to the assembler without the escape character. The mangling style
2611    options are
2612
2613    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
2614    * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
2615    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
2616    * ``o``: Mach-O mangling: Private symbols get ``L`` prefix. Other
2617      symbols get a ``_`` prefix.
2618    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
2619      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
2620      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
2621      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
2622      starting with ``?`` are not mangled in any way.
2623    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
2624      symbols do not receive a ``_`` prefix.
2625    * ``a``: XCOFF mangling: Private symbols get a ``L..`` prefix.
2626``n<size1>:<size2>:<size3>...``
2627    This specifies a set of native integer widths for the target CPU in
2628    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
2629    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
2630    this set are considered to support most general arithmetic operations
2631    efficiently.
2632``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Types <nointptrtype>`.  The ``0``
    address space cannot be specified as non-integral.
2636
2637On every specification that takes a ``<abi>:<pref>``, specifying the
2638``<pref>`` alignment is optional. If omitted, the preceding ``:``
2639should be omitted too and ``<pref>`` will be equal to ``<abi>``.
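
To make the notation concrete, here is a layout string of the kind used for
x86-64 ELF targets. It is shown as an illustration only: the exact string for a
given target varies between LLVM versions and should be taken from the target
itself rather than from this sketch.

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF name mangling
    ; i64:64       i64 has 64-bit ABI alignment; <pref> is omitted, so it equals <abi>
    ; f80:128      the 80-bit floating-point type has 128-bit ABI alignment
    ; n8:16:32:64  native integer widths
    ; S128         natural stack alignment is 128 bits
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"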
2640
2641When constructing the data layout for a given target, LLVM starts with a
2642default set of specifications which are then (possibly) overridden by
2643the specifications in the ``datalayout`` keyword. The default
2644specifications are given in this list:
2645
2646-  ``E`` - big endian
2647-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
2648-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
2649   same as the default address space.
2650-  ``S0`` - natural stack alignment is unspecified
2651-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
2652-  ``i8:8:8`` - i8 is 8-bit (byte) aligned
2653-  ``i16:16:16`` - i16 is 16-bit aligned
2654-  ``i32:32:32`` - i32 is 32-bit aligned
2655-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
2656   alignment of 64-bits
2657-  ``f16:16:16`` - half is 16-bit aligned
2658-  ``f32:32:32`` - float is 32-bit aligned
2659-  ``f64:64:64`` - double is 64-bit aligned
2660-  ``f128:128:128`` - quad is 128-bit aligned
2661-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
2662-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
2663-  ``a:0:64`` - aggregates are 64-bit aligned
2664
2665When LLVM is determining the alignment for a given type, it uses the
2666following rules:
2667
2668#. If the type sought is an exact match for one of the specifications,
2669   that specification is used.
2670#. If no match is found, and the type sought is an integer type, then
2671   the smallest integer type that is larger than the bitwidth of the
2672   sought type is used. If none of the specifications are larger than
2673   the bitwidth then the largest integer type is used. For example,
2674   given the default specifications above, the i7 type will use the
2675   alignment of i8 (next largest) while both i65 and i256 will use the
2676   alignment of i64 (largest specified).
2677#. If no match is found, and the type sought is a vector type, then the
2678   largest vector type that is smaller than the sought vector type will
2679   be used as a fall back. This happens because <128 x double> can be
2680   implemented in terms of 64 <2 x double>, for example.
2681
2682The function of the data layout string may not be what you expect.
2683Notably, this is not a specification from the frontend of what alignment
2684the code generator should use.
2685
2686Instead, if specified, the target data layout is required to match what
2687the ultimate *code generator* expects. This string is used by the
2688mid-level optimizers to improve code, and this only works if it matches
2689what the ultimate code generator uses. There is no way to generate IR
2690that does not embed this target-specific detail into the IR. If you
2691don't specify the string, the default specifications will be used to
2692generate a Data Layout and the optimization phases will operate
2693accordingly and introduce target specificity into the IR with respect to
2694these default specifications.
2695
2696.. _langref_triple:
2697
2698Target Triple
2699-------------
2700
2701A module may specify a target triple string that describes the target
2702host. The syntax for the target triple is simply:
2703
2704.. code-block:: llvm
2705
2706    target triple = "x86_64-apple-macosx10.7.0"
2707
2708The *target triple* string consists of a series of identifiers delimited
2709by the minus sign character ('-'). The canonical forms are:
2710
2711::
2712
2713    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
2714    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
2715
This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` option.
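
For illustration, a triple in the four-component form, with an explicit
environment, could look like this:

.. code-block:: llvm

    ; ARCHITECTURE = x86_64, VENDOR = unknown,
    ; OPERATING_SYSTEM = linux, ENVIRONMENT = gnu
    target triple = "x86_64-unknown-linux-gnu"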
2719
2720.. _objectlifetime:
2721
2722Object Lifetime
---------------
2724
2725A memory object, or simply object, is a region of a memory space that is
2726reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
2727allocation calls, and global variable definitions.
2728Once it is allocated, the bytes stored in the region can only be read or written
2729through a pointer that is :ref:`based on <pointeraliasing>` the allocation
2730value.
2731If a pointer that is not based on the object tries to read or write to the
2732object, it is undefined behavior.
2733
The lifetime of a memory object is a property that decides its accessibility.
Unless stated otherwise, a memory object is alive from its allocation, and
dead after its deallocation.
2737It is undefined behavior to access a memory object that isn't alive, but
2738operations that don't dereference it such as
2739:ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
2740:ref:`icmp <i_icmp>` return a valid result.
2741This explains code motion of these instructions across operations that
2742impact the object's lifetime.
2743A stack object's lifetime can be explicitly specified using
2744:ref:`llvm.lifetime.start <int_lifestart>` and
2745:ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
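
The following is a minimal sketch of these rules for a stack object. The
function name ``@lifetime_example`` is hypothetical, and the intrinsic
declarations use the typed-pointer spelling found in the other examples in this
document.

.. code-block:: llvm

    declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture)
    declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture)

    define i8 @lifetime_example() {
    entry:
      %buf = alloca i8
      call void @llvm.lifetime.start.p0i8(i64 1, i8* %buf)
      store i8 42, i8* %buf          ; the object is alive: access is allowed
      %v = load i8, i8* %buf
      call void @llvm.lifetime.end.p0i8(i64 1, i8* %buf)

      ; The object is no longer alive, so it must not be loaded from or
      ; stored to. Operations that do not dereference the pointer still
      ; return a valid result.
      %p = getelementptr i8, i8* %buf, i64 0
      %is_null = icmp eq i8* %p, null
      ret i8 %v
    }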
2746
2747.. _pointeraliasing:
2748
2749Pointer Aliasing Rules
2750----------------------
2751
2752Any memory access must be done through a pointer value associated with
2753an address range of the memory access, otherwise the behavior is
2754undefined. Pointer values are associated with address ranges according
2755to the following rules:
2756
2757-  A pointer value is associated with the addresses associated with any
2758   value it is *based* on.
2759-  An address of a global variable is associated with the address range
2760   of the variable's storage.
2761-  The result value of an allocation instruction is associated with the
2762   address range of the allocated storage.
2763-  A null pointer in the default address-space is associated with no
2764   address.
2765-  An :ref:`undef value <undefvalues>` in *any* address-space is
2766   associated with no address.
2767-  An integer constant other than zero or a pointer value returned from
2768   a function not defined within LLVM may be associated with address
2769   ranges allocated through mechanisms other than those provided by
2770   LLVM. Such ranges shall not overlap with any ranges of addresses
2771   allocated by mechanisms provided by LLVM.
2772
2773A pointer value is *based* on another pointer value according to the
2774following rules:
2775
2776-  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
2777   the pointer-typed operand of the ``getelementptr``.
2778-  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
2779   is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
2780   of the ``getelementptr``.
2781-  The result value of a ``bitcast`` is *based* on the operand of the
2782   ``bitcast``.
2783-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
2784   values that contribute (directly or indirectly) to the computation of
2785   the pointer's value.
2786-  The "*based* on" relationship is transitive.
2787
2788Note that this definition of *"based"* is intentionally similar to the
2789definition of *"based"* in C99, though it is slightly weaker.
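
As a small sketch of how the "*based* on" relation propagates, consider the
following function (the name ``@based_on`` is hypothetical). Each comment
states which rule above applies.

.. code-block:: llvm

    define i8 @based_on(i32* %p) {
      %q = getelementptr i32, i32* %p, i64 1 ; %q is based on %p (getelementptr)
      %r = bitcast i32* %q to i8*            ; %r is based on %q (bitcast), hence on %p
      %i = ptrtoint i8* %r to i64
      %j = add i64 %i, 1
      %s = inttoptr i64 %j to i8*            ; %s is based on %r, which contributes to %j
      %v = load i8, i8* %s                   ; the access goes through a pointer based on %p
      ret i8 %v
    }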
2790
2791LLVM IR does not associate types with memory. The result type of a
2792``load`` merely indicates the size and alignment of the memory from
2793which to load, as well as the interpretation of the value. The first
2794operand type of a ``store`` similarly only indicates the size and
2795alignment of the store.
2796
2797Consequently, type-based alias analysis, aka TBAA, aka
2798``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
2799:ref:`Metadata <metadata>` may be used to encode additional information
2800which specialized optimization passes may use to implement type-based
2801alias analysis.
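
Because memory itself is untyped, a location written with one type may be read
back with another type of the same size. The sketch below (with the
hypothetical name ``@float_bits``) stores a ``float`` and reloads the same
bytes as an ``i32``; no metadata is attached, so nothing lets an optimizer
assume the two accesses do not alias.

.. code-block:: llvm

    define i32 @float_bits(float %x) {
      %slot = alloca float
      store float %x, float* %slot       ; writes 4 bytes
      %as_int = bitcast float* %slot to i32*
      %bits = load i32, i32* %as_int     ; reads the same 4 bytes, interpreted as i32
      ret i32 %bits
    }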
2802
2803.. _pointercapture:
2804
2805Pointer Capture
2806---------------
2807
Given a function call and a pointer that is passed as an argument or stored in
memory before the call, the pointer is *captured* by the call if the call makes
a copy of any part of the pointer that outlives the call.
2811To be precise, a pointer is captured if one or more of the following conditions
2812hold:
2813
28141. The call stores any bit of the pointer carrying information into a place,
2815   and the stored bits can be read from the place by the caller after this call
2816   exits.
2817
2818.. code-block:: llvm
2819
2820    @glb  = global i8* null
2821    @glb2 = global i8* null
2822    @glb3 = global i8* null
2823    @glbi = global i32 0
2824
2825    define i8* @f(i8* %a, i8* %b, i8* %c, i8* %d, i8* %e) {
2826      store i8* %a, i8** @glb ; %a is captured by this call
2827
2828      store i8* %b,   i8** @glb2 ; %b isn't captured because the stored value is overwritten by the store below
2829      store i8* null, i8** @glb2
2830
2831      store i8* %c,   i8** @glb3
2832      call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
2833      store i8* null, i8** @glb3
2834
2835      %i = ptrtoint i8* %d to i64
2836      %j = trunc i64 %i to i32
2837      store i32 %j, i32* @glbi ; %d is captured
2838
2839      ret i8* %e ; %e is captured
2840    }
2841
28422. The call stores any bit of the pointer carrying information into a place,
2843   and the stored bits can be safely read from the place by another thread via
2844   synchronization.
2845
2846.. code-block:: llvm
2847
    @glb  = global i8* null
    @lock = global i8 1

    define void @f(i8* %a) {
      store i8* %a, i8** @glb
      ; %a is captured because another thread can safely read @glb after
      ; observing this release store. Atomic accesses need a byte-sized
      ; type and an explicit alignment.
      store atomic i8 0, i8* @lock release, align 1
      store i8* null, i8** @glb
      ret void
    }
2856
28573. The call's behavior depends on any bit of the pointer carrying information.
2858
2859.. code-block:: llvm
2860
2861    @glb = global i8 0
2862
2863    define void @f(i8* %a) {
2864      %c = icmp eq i8* %a, @glb
      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; %a is captured: control flow depends on it
2866    BB_EXIT:
2867      call void @exit()
2868      unreachable
2869    BB_CONTINUE:
2870      ret void
2871    }
2872
28734. The pointer is used in a volatile access as its address.
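
A minimal sketch of the fourth condition, in the style of the examples above:

.. code-block:: llvm

    define void @f(i8* %a) {
      store volatile i8 0, i8* %a ; %a is captured: it is the address of a volatile access
      ret void
    }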
2874
2875
2876.. _volatile:
2877
2878Volatile Memory Accesses
2879------------------------
2880
Certain memory accesses, such as :ref:`loads <i_load>`,
:ref:`stores <i_store>`, and :ref:`llvm.memcpy <int_memcpy>` calls, may be
marked ``volatile``. The optimizers must not change the number of
2884volatile operations or change their order of execution relative to other
2885volatile operations. The optimizers *may* change the order of volatile
2886operations relative to non-volatile operations. This is not Java's
2887"volatile" and has no cross-thread synchronization behavior.
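
As a sketch of the ordering rules, the hypothetical function ``@poll`` below
reads a status location twice. The names ``%status`` and ``%scratch`` are
illustrative only.

.. code-block:: llvm

    define i32 @poll(i32* %status, i32* %scratch) {
      ; Both volatile loads must be kept, in this order.
      %first = load volatile i32, i32* %status
      ; A non-volatile access may be moved relative to the volatile loads,
      ; provided the usual dependence rules allow it (for example, if
      ; %scratch is known not to alias %status).
      store i32 0, i32* %scratch
      %second = load volatile i32, i32* %status
      %sum = add i32 %first, %second
      ret i32 %sum
    }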
2888
2889A volatile load or store may have additional target-specific semantics.
2890Any volatile operation can have side effects, and any volatile operation
2891can read and/or modify state which is not accessible via a regular load
2892or store in this module. Volatile operations may use addresses which do
2893not point to memory (like MMIO registers). This means the compiler may
2894not use a volatile operation to prove a non-volatile access to that
2895address has defined behavior.
2896
2897The allowed side-effects for volatile accesses are limited.  If a
2898non-volatile store to a given address would be legal, a volatile
2899operation may modify the memory at that address. A volatile operation
2900may not modify any other memory accessible by the module being compiled.
2901A volatile operation may not call any code in the current module.
2902
2903The compiler may assume execution will continue after a volatile operation,
2904so operations which modify memory or may have undefined behavior can be
2905hoisted past a volatile operation.
2906
2907As an exception to the preceding rule, the compiler may not assume execution
2908will continue after a volatile store operation. This restriction is necessary
2909to support the somewhat common pattern in C of intentionally storing to an
2910invalid pointer to crash the program. In the future, it might make sense to
2911allow frontends to control this behavior.
2912
2913IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
2914or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
2915Likewise, the backend should never split or merge target-legal volatile
2916load/store instructions. Similarly, IR-level volatile loads and stores cannot
2917change from integer to floating-point or vice versa.
2918
2919.. admonition:: Rationale
2920
 Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
2923 this holds for an l-value of volatile primitive type with native
2924 hardware support, but not necessarily for aggregate types. The
2925 frontend upholds these expectations, which are intentionally
2926 unspecified in the IR. The rules above ensure that IR transformations
2927 do not violate the frontend's contract with the language.
2928
2929.. _memmodel:
2930
2931Memory Model for Concurrent Operations
2932--------------------------------------
2933
2934The LLVM IR does not define any way to start parallel threads of
2935execution or to register signal handlers. Nonetheless, there are
2936platform-specific ways to create them, and we define LLVM IR's behavior
2937in their presence. This model is inspired by the C++0x memory model.
2938
For a more informal introduction to this model, see the :doc:`Atomics` guide.
2940
2941We define a *happens-before* partial order as the least partial order
2942that
2943
2944-  Is a superset of single-thread program order, and
-  When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
2946   ``b``. *Synchronizes-with* pairs are introduced by platform-specific
2947   techniques, like pthread locks, thread creation, thread joining,
2948   etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
2949   Constraints <ordering>`).
2950
2951Note that program order does not introduce *happens-before* edges
2952between a thread and signals executing inside that thread.
2953
2954Every (defined) read operation (load instructions, memcpy, atomic
2955loads/read-modify-writes, etc.) R reads a series of bytes written by
2956(defined) write operations (store instructions, atomic
2957stores/read-modify-writes, memcpy, etc.). For the purposes of this
2958section, initialized globals are considered to have a write of the
2959initializer which is atomic and happens before any other read or write
2960of the memory in question. For each byte of a read R, R\ :sub:`byte`
2961may see any write to the same byte, except:
2962
2963-  If write\ :sub:`1`  happens before write\ :sub:`2`, and
2964   write\ :sub:`2` happens before R\ :sub:`byte`, then
2965   R\ :sub:`byte` does not see write\ :sub:`1`.
2966-  If R\ :sub:`byte` happens before write\ :sub:`3`, then
2967   R\ :sub:`byte` does not see write\ :sub:`3`.
2968
2969Given that definition, R\ :sub:`byte` is defined as follows:
2970
2971-  If R is volatile, the result is target-dependent. (Volatile is
2972   supposed to give guarantees which can support ``sig_atomic_t`` in
2973   C/C++, and may be used for accesses to addresses that do not behave
2974   like normal memory. It does not generally provide cross-thread
2975   synchronization.)
2976-  Otherwise, if there is no write to the same byte that happens before
2977   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
2978-  Otherwise, if R\ :sub:`byte` may see exactly one write,
2979   R\ :sub:`byte` returns the value written by that write.
2980-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
2981   see are atomic, it chooses one of the values written. See the :ref:`Atomic
2982   Memory Ordering Constraints <ordering>` section for additional
2983   constraints on how the choice is made.
2984-  Otherwise R\ :sub:`byte` returns ``undef``.
2985
2986R returns the value composed of the series of bytes it read. This
2987implies that some bytes within the value may be ``undef`` **without**
2988the entire value being ``undef``. Note that this only defines the
2989semantics of the operation; it doesn't mean that targets will emit more
2990than one instruction to read the series of bytes.
2991
2992Note that in cases where none of the atomic intrinsics are used, this
2993model places only one restriction on IR transformations on top of what
2994is required for single-threaded execution: introducing a store to a byte
2995which might not otherwise be stored is not allowed in general.
2996(Specifically, in the case where another thread might write to and read
2997from an address, introducing a store can change a load that may see
2998exactly one write into a load that may see multiple writes.)
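
The restriction on introduced stores can be sketched as follows. Both function
names are hypothetical. The second function is what a single-threaded view
might consider an equivalent, branch-free form of the first, but it is not a
valid transformation under this model because it stores to ``%p`` even when
``%c`` is false.

.. code-block:: llvm

    ; Original: the store only executes when %c is true.
    define void @maybe_store(i1 %c, i32* %p) {
    entry:
      br i1 %c, label %then, label %exit
    then:
      store i32 1, i32* %p
      br label %exit
    exit:
      ret void
    }

    ; Not allowed in general: when %c is false this writes back the old
    ; value, introducing a store that another thread's load may now see
    ; alongside that thread's own writes.
    define void @maybe_store_bad(i1 %c, i32* %p) {
      %old = load i32, i32* %p
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, i32* %p
      ret void
    }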
2999
3000.. _ordering:
3001
3002Atomic Memory Ordering Constraints
3003----------------------------------
3004
3005Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3006:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3007:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3008ordering parameters that determine which other atomic instructions on
3009the same address they *synchronize with*. These semantics are borrowed
3010from Java and C++0x, but are somewhat more colloquial. If these
3011descriptions aren't precise enough, check those specs (see spec
3012references in the :doc:`atomics guide <Atomics>`).
3013:ref:`fence <i_fence>` instructions treat these orderings somewhat
3014differently since they don't take an address. See that instruction's
3015documentation for details.
3016
For a simpler introduction to the ordering constraints, see the
:doc:`Atomics` guide.
3019
3020``unordered``
3021    The set of values that can be read is governed by the happens-before
3022    partial order. A value cannot be read unless some operation wrote
3023    it. This is intended to provide a guarantee strong enough to model
3024    Java's non-volatile shared variables. This ordering cannot be
3025    specified for read-modify-write operations; it is not strong enough
3026    to make them atomic in any interesting way.
3027``monotonic``
3028    In addition to the guarantees of ``unordered``, there is a single
3029    total order for modifications by ``monotonic`` operations on each
3030    address. All modification orders must be compatible with the
3031    happens-before order. There is no guarantee that the modification
3032    orders can be combined to a global total order for the whole program
3033    (and this often will not be possible). The read in an atomic
3034    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3035    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3036    order immediately before the value it writes. If one atomic read
3037    happens before another atomic read of the same address, the later
3038    read must see the same value or a later value in the address's
3039    modification order. This disallows reordering of ``monotonic`` (or
3040    stronger) operations on the same address. If an address is written
3041    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3042    read that address repeatedly, the other threads must eventually see
3043    the write. This corresponds to the C++0x/C1x
3044    ``memory_order_relaxed``.
3045``acquire``
3046    In addition to the guarantees of ``monotonic``, a
3047    *synchronizes-with* edge may be formed with a ``release`` operation.
3048    This is intended to model C++'s ``memory_order_acquire``.
3049``release``
3050    In addition to the guarantees of ``monotonic``, if this operation
3051    writes a value which is subsequently read by an ``acquire``
3052    operation, it *synchronizes-with* that operation. (This isn't a
3053    complete description; see the C++0x definition of a release
3054    sequence.) This corresponds to the C++0x/C1x
3055    ``memory_order_release``.
3056``acq_rel`` (acquire+release)
3057    Acts as both an ``acquire`` and ``release`` operation on its
3058    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
3059``seq_cst`` (sequentially consistent)
3060    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3061    operation that only reads, ``release`` for an operation that only
3062    writes), there is a global total order on all
3063    sequentially-consistent operations on all addresses, which is
3064    consistent with the *happens-before* partial order and with the
3065    modification orders of all the affected addresses. Each
3066    sequentially-consistent read sees the last preceding write to the
3067    same address in this global order. This corresponds to the C++0x/C1x
3068    ``memory_order_seq_cst`` and Java volatile.
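
As an illustration of how ``release`` and ``acquire`` pair up, the sketch below
passes a value between two threads. The globals ``@data`` and ``@flag`` and
both function names are hypothetical, and the sketch assumes ``@producer`` is
the only writer of either global.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, i32* @data                         ; ordinary, non-atomic store
      store atomic i32 1, i32* @flag release, align 4  ; publishes the store above
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin
    done:
      ; The acquire load read the value written by the release store, so the
      ; two synchronize and the store to @data happens before this load.
      %v = load i32, i32* @data
      ret i32 %v
    }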
3069
3070.. _syncscope:
3071
3072If an atomic operation is marked ``syncscope("singlethread")``, it only
3073*synchronizes with* and only participates in the seq\_cst total orderings of
3074other operations running in the same thread (for example, in signal handlers).
3075
3076If an atomic operation is marked ``syncscope("<target-scope>")``, where
3077``<target-scope>`` is a target specific synchronization scope, then it is target
3078dependent if it *synchronizes with* and participates in the seq\_cst total
3079orderings of other operations.
3080
3081Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3082or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3083seq\_cst total orderings of other operations that are not marked
3084``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
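
A short sketch of the syntax, assuming a flag that is shared only with a signal
handler running in the same thread (the function name is hypothetical):

.. code-block:: llvm

    define void @publish_to_signal_handler(i32* %flag) {
      ; Synchronizes only with other operations running in the same thread,
      ; such as a signal handler.
      store atomic i32 1, i32* %flag syncscope("singlethread") release, align 4
      fence syncscope("singlethread") seq_cst
      ret void
    }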
3085
3086.. _floatenv:
3087
3088Floating-Point Environment
3089--------------------------
3090
3091The default LLVM floating-point environment assumes that floating-point
3092instructions do not have side effects. Results assume the round-to-nearest
3093rounding mode. No floating-point exception state is maintained in this
3094environment. Therefore, there is no attempt to create or preserve invalid
3095operation (SNaN) or division-by-zero exceptions.
3096
3097The benefit of this exception-free assumption is that floating-point
3098operations may be speculated freely without any other fast-math relaxations
3099to the floating-point model.
3100
3101Code that requires different behavior than this should use the
3102:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
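
For example, an addition that must honor the dynamic rounding mode and preserve
exception behavior could be written with the constrained ``fadd`` intrinsic.
This is a sketch with a hypothetical function name; see the intrinsic
documentation for the full set of metadata arguments.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    define double @add_strict(double %a, double %b) strictfp {
      %r = call double @llvm.experimental.constrained.fadd.f64(
                           double %a, double %b,
                           metadata !"round.dynamic",
                           metadata !"fpexcept.strict") strictfp
      ret double %r
    }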
3103
3104.. _fastmath:
3105
3106Fast-Math Flags
3107---------------
3108
3109LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3110:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3111:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
3112:ref:`select <i_select>` and :ref:`call <i_call>`
3113may use the following flags to enable otherwise unsafe
3114floating-point transformations.
3115
3116``nnan``
3117   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
3119   a :ref:`poison value <poisonvalues>` instead.
3120
3121``ninf``
3122   No Infs - Allow optimizations to assume the arguments and result are not
3123   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3124   produces a :ref:`poison value <poisonvalues>` instead.
3125
3126``nsz``
3127   No Signed Zeros - Allow optimizations to treat the sign of a zero
3128   argument or result as insignificant. This does not imply that -0.0
3129   is poison and/or guaranteed to not exist in the operation.
3130
3131``arcp``
3132   Allow Reciprocal - Allow optimizations to use the reciprocal of an
3133   argument rather than perform division.
3134
3135``contract``
3136   Allow floating-point contraction (e.g. fusing a multiply followed by an
3137   addition into a fused multiply-and-add). This does not enable reassociating
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
3139   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3140
3141``afn``
3142   Approximate functions - Allow substitution of approximate calculations for
3143   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
3144   for places where this can apply to LLVM's intrinsic math functions.
3145
3146``reassoc``
3147   Allow reassociation transformations for floating-point instructions.
3148   This may dramatically change results in floating-point.
3149
3150``fast``
3151   This flag implies all of the others.
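
The flags are written between the opcode and the type. The function below is a
sketch with a hypothetical name that shows several of them in use.

.. code-block:: llvm

    define float @fmf_example(float %a, float %b, float %c) {
      %m = fmul contract float %a, %b
      %s = fadd contract float %m, %c   ; %m and %s may be fused into one fma
      %d = fdiv arcp float %s, %b       ; may be computed using the reciprocal of %b
      %r = fadd nnan ninf float %d, %a  ; poison if an operand or the result is NaN or +/-Inf
      %f = fmul fast float %r, %r       ; all of the flags above apply
      ret float %f
    }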
3152
3153.. _uselistorder:
3154
3155Use-list Order Directives
3156-------------------------
3157
3158Use-list directives encode the in-memory order of each use-list, allowing the
3159order to be recreated. ``<order-indexes>`` is a comma-separated list of
3160indexes that are assigned to the referenced value's uses. The referenced
3161value's use-list is immediately sorted by these indexes.
3162
3163Use-list directives may appear at function scope or global scope. They are not
3164instructions, and have no effect on the semantics of the IR. When they're at
3165function scope, they must appear after the terminator of the final basic block.
3166
3167If basic blocks have their address taken via ``blockaddress()`` expressions,
3168``uselistorder_bb`` can be used to reorder their use-lists from outside their
3169function's scope.
3170
3171:Syntax:
3172
3173::
3174
3175    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }
3177
3178:Examples:
3179
3180::
3181
3182    define void @foo(i32 %arg1, i32 %arg2) {
3183    entry:
3184      ; ... instructions ...
3185    bb:
3186      ; ... instructions ...
3187
3188      ; At function scope.
3189      uselistorder i32 %arg1, { 1, 0, 2 }
3190      uselistorder label %bb, { 1, 0 }
3191    }
3192
3193    ; At global scope.
3194    uselistorder i32* @global, { 1, 2, 0 }
3195    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32)* @bar, { 1, 0 }
3197    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3198
3199.. _source_filename:
3200
3201Source Filename
3202---------------
3203
3204The *source filename* string is set to the original module identifier,
3205which will be the name of the compiled source file when compiling from
3206source through the clang front end, for example. It is then preserved through
3207the IR and bitcode.
3208
3209This is currently necessary to generate a consistent unique global
3210identifier for local functions used in profile data, which prepends the
3211source file name to the local function name.
3212
3213The syntax for the source file name is simply:
3214
3215.. code-block:: text
3216
3217    source_filename = "/path/to/source.c"
3218
3219.. _typesystem:
3220
3221Type System
3222===========
3223
3224The LLVM type system is one of the most important features of the
3225intermediate representation. Being typed enables a number of
3226optimizations to be performed on the intermediate representation
3227directly, without having to do extra analyses on the side before the
3228transformation. A strong type system makes it easier to read the
3229generated code and enables novel analyses and transformations that are
3230not feasible to perform on normal three address code representations.
3231
3232.. _t_void:
3233
3234Void Type
3235---------
3236
3237:Overview:
3238
3239
3240The void type does not represent any value and has no size.
3241
3242:Syntax:
3243
3244
3245::
3246
3247      void
3248
3249
3250.. _t_function:
3251
3252Function Type
3253-------------
3254
3255:Overview:
3256
3257
3258The function type can be thought of as a function signature. It consists of a
3259return type and a list of formal parameter types. The return type of a function
3260type is a void type or first class type --- except for :ref:`label <t_label>`
3261and :ref:`metadata <t_metadata>` types.
3262
3263:Syntax:
3264
3265::
3266
3267      <returntype> (<parameter list>)
3268
3269...where '``<parameter list>``' is a comma-separated list of type
3270specifiers. Optionally, the parameter list may include a type ``...``, which
3271indicates that the function takes a variable number of arguments. Variable
3272argument functions can access their arguments with the :ref:`variable argument
3273handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3274except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3275
3276:Examples:
3277
3278+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3279| ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
3280+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3281| ``float (i16, i32 *) *``        | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.                                    |
3282+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3283| ``i32 (i8*, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. |
3284+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3285| ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
3286+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3287
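The function types in the table above might appear in a module as sketched
below. Apart from ``@printf``, the function names (``@pair_of`` and
``@call_through_pointer``) are made up for this illustration.

.. code-block:: llvm

    ; A vararg function of type 'i32 (i8*, ...)'.
    declare i32 @printf(i8*, ...)

    ; @pair_of is a hypothetical function of type '{i32, i32} (i32)'.
    define {i32, i32} @pair_of(i32 %x) {
    entry:
      %p0 = insertvalue {i32, i32} undef, i32 %x, 0
      %p1 = insertvalue {i32, i32} %p0, i32 %x, 1
      ret {i32, i32} %p1
    }

    ; @call_through_pointer is hypothetical. %fp has type 'i32 (i32)*',
    ; a pointer to a function of type 'i32 (i32)'.
    define i32 @call_through_pointer(i32 (i32)* %fp, i32 %x) {
    entry:
      %r = call i32 %fp(i32 %x)
      ret i32 %r
    }
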
3288.. _t_firstclass:
3289
3290First Class Types
3291-----------------
3292
3293The :ref:`first class <t_firstclass>` types are perhaps the most important.
3294Values of these types are the only ones which can be produced by
3295instructions.
3296
3297.. _t_single_value:
3298
3299Single Value Types
3300^^^^^^^^^^^^^^^^^^
3301
3302These are the types that are valid in registers from CodeGen's perspective.
3303
3304.. _t_integer:
3305
3306Integer Type
3307""""""""""""
3308
3309:Overview:
3310
The integer type is a simple type that specifies an arbitrary bit width
for the desired integer. Any bit width from 1
3313bit to 2\ :sup:`23`\ (about 8 million) can be specified.
3314
3315:Syntax:
3316
3317::
3318
3319      iN
3320
3321The number of bits the integer will occupy is specified by the ``N``
3322value.
3323
:Examples:
3326
3327+----------------+------------------------------------------------+
3328| ``i1``         | a single-bit integer.                          |
3329+----------------+------------------------------------------------+
3330| ``i32``        | a 32-bit integer.                              |
3331+----------------+------------------------------------------------+
3332| ``i1942652``   | a really big integer of over 1 million bits.   |
3333+----------------+------------------------------------------------+
3334
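Bit widths need not be powers of two. As a small sketch (the function name
``@low_seven_bits`` is hypothetical), a value can be narrowed to ``i7`` and
widened again:

.. code-block:: llvm

    ; @low_seven_bits is a hypothetical function used only for illustration.
    define i32 @low_seven_bits(i32 %x) {
    entry:
      ; Keep only the low 7 bits of %x in a value of type i7.
      %narrow = trunc i32 %x to i7
      ; Widen back to 32 bits, filling the upper bits with zeros.
      %wide = zext i7 %narrow to i32
      ret i32 %wide
    }
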
3335.. _t_floating:
3336
3337Floating-Point Types
3338""""""""""""""""""""
3339
3340.. list-table::
3341   :header-rows: 1
3342
3343   * - Type
3344     - Description
3345
3346   * - ``half``
3347     - 16-bit floating-point value
3348
3349   * - ``bfloat``
3350     - 16-bit "brain" floating-point value (7-bit significand).  Provides the
3351       same number of exponent bits as ``float``, so that it matches its dynamic
3352       range, but with greatly reduced precision.  Used in Intel's AVX-512 BF16
3353       extensions and Arm's ARMv8.6-A extensions, among others.
3354
3355   * - ``float``
3356     - 32-bit floating-point value
3357
3358   * - ``double``
3359     - 64-bit floating-point value
3360
3361   * - ``fp128``
3362     - 128-bit floating-point value (113-bit significand)
3363
3364   * - ``x86_fp80``
3365     -  80-bit floating-point value (X87)
3366
3367   * - ``ppc_fp128``
3368     - 128-bit floating-point value (two 64-bits)
3369
The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128,
respectively.
3373
3374X86_amx Type
3375""""""""""""
3376
3377:Overview:
3378
3379The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
are allowed: stride load and store, zero, and dot product. No instruction is
3382allowed for this type. There are no arguments, arrays, pointers, vectors
3383or constants of this type.
3384
3385:Syntax:
3386
3387::
3388
3389      x86_amx
3390
3391
3392X86_mmx Type
3393""""""""""""
3394
3395:Overview:
3396
3397The x86_mmx type represents a value held in an MMX register on an x86
3398machine. The operations allowed on it are quite limited: parameters and
3399return values, load and store, and bitcast. User-specified MMX
3400instructions are represented as intrinsic or asm calls with arguments
3401and/or results of this type. There are no arrays, vectors or constants
3402of this type.
3403
3404:Syntax:
3405
3406::
3407
3408      x86_mmx
3409
3410
3411.. _t_pointer:
3412
3413Pointer Type
3414""""""""""""
3415
3416:Overview:
3417
3418The pointer type is used to specify memory locations. Pointers are
3419commonly used to reference objects in memory.
3420
3421Pointer types may have an optional address space attribute defining the
3422numbered address space where the pointed-to object resides. The default
3423address space is number zero. The semantics of non-zero address spaces
3424are target-specific.
3425
3426Note that LLVM does not permit pointers to void (``void*``) nor does it
3427permit pointers to labels (``label*``). Use ``i8*`` instead.
3428
3429LLVM is in the process of transitioning to
3430`opaque pointers <OpaquePointers.html#opaque-pointers>`_.
3431Opaque pointers do not have a pointee type. Rather, instructions
3432interacting through pointers specify the type of the underlying memory
they are interacting with. Opaque pointers are still being worked on and
are not complete.
3435
3436:Syntax:
3437
3438::
3439
3440      <type> *
3441      ptr
3442
3443:Examples:
3444
3445+-------------------------+--------------------------------------------------------------------------------------------------------------+
3446| ``[4 x i32]*``          | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values.                               |
3447+-------------------------+--------------------------------------------------------------------------------------------------------------+
3448| ``i32 (i32*) *``        | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. |
3449+-------------------------+--------------------------------------------------------------------------------------------------------------+
3450| ``i32 addrspace(5)*``   | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space 5.                            |
3451+-------------------------+--------------------------------------------------------------------------------------------------------------+
3452| ``ptr``                 | An opaque pointer type to a value that resides in address space 0.                                           |
3453+-------------------------+--------------------------------------------------------------------------------------------------------------+
3454| ``ptr addrspace(5)``    | An opaque pointer type to a value that resides in address space 5.                                           |
3455+-------------------------+--------------------------------------------------------------------------------------------------------------+
3456
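The sketch below shows typed pointers in the default address space and in a
non-zero one. The names ``@g`` and ``@load_both`` are hypothetical, and address
space 5 is an arbitrary choice whose meaning depends on the target.

.. code-block:: llvm

    ; @g is a hypothetical global placed in address space 5.
    @g = addrspace(5) global i32 0

    ; @load_both is a hypothetical function used only for illustration.
    define i32 @load_both(i32* %p) {
    entry:
      ; %p points into the default address space (0).
      %a = load i32, i32* %p
      ; The address of @g has type 'i32 addrspace(5)*'.
      %b = load i32, i32 addrspace(5)* @g
      %sum = add i32 %a, %b
      ret i32 %sum
    }
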
3457.. _t_vector:
3458
3459Vector Type
3460"""""""""""
3461
3462:Overview:
3463
3464A vector type is a simple derived type that represents a vector of
3465elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
3467requires a size (number of elements), an underlying primitive data type,
3468and a scalable property to represent vectors where the exact hardware
3469vector length is unknown at compile time. Vector types are considered
3470:ref:`first class <t_firstclass>`.
3471
3472:Memory Layout:
3473
3474In general vector elements are laid out in memory in the same way as
3475:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
3476elements are byte sized. However, when the elements of the vector aren't byte
3477sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
3479integer type with N*M bits, and then following the rules for storing such an
3480integer to memory.
3481
3482A bitcast from a vector type to a scalar integer type will see the elements
3483being packed together (without padding). The order in which elements are
inserted in the integer depends on endianness. For little endian, element zero
is put in the least significant bits of the integer, and for big endian,
element zero is put in the most significant bits.
3487
3488Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
3489with the analogy that we can replace a vector store by a bitcast followed by
3490an integer store, we get this for big endian:
3491
3492.. code-block:: llvm
3493
3494      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3495
3496      ; Bitcasting from a vector to an integral type can be seen as
3497      ; concatenating the values:
3498      ;   %val now has the hexadecimal value 0x1235.
3499
3500      store i16 %val, i16* %ptr
3501
3502      ; In memory the content will be (8-bit addressing):
3503      ;
3504      ;    [%ptr + 0]: 00010010  (0x12)
3505      ;    [%ptr + 1]: 00110101  (0x35)
3506
3507The same example for little endian:
3508
3509.. code-block:: llvm
3510
3511      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3512
3513      ; Bitcasting from a vector to an integral type can be seen as
3514      ; concatenating the values:
3515      ;   %val now has the hexadecimal value 0x5321.
3516
3517      store i16 %val, i16* %ptr
3518
3519      ; In memory the content will be (8-bit addressing):
3520      ;
      ;    [%ptr + 0]: 00100001  (0x21)
      ;    [%ptr + 1]: 01010011  (0x53)
3523
3524When ``<N*M>`` isn't evenly divisible by the byte size the exact memory layout
3525is unspecified (just like it is for an integral type of the same size). This
3526is because different targets could put the padding at different positions when
3527the type size is smaller than the type's store size.
3528
3529:Syntax:
3530
3531::
3532
3533      < <# elements> x <elementtype> >          ; Fixed-length vector
3534      < vscale x <# elements> x <elementtype> > ; Scalable vector
3535
3536The number of elements is a constant integer value larger than 0;
3537elementtype may be any integer, floating-point or pointer type. Vectors
3538of size zero are not allowed. For scalable vectors, the total number of
3539elements is a constant multiple (called vscale) of the specified number
3540of elements; vscale is a positive integer that is unknown at compile time
3541and the same hardware-dependent constant for all scalable vectors at run
3542time. The size of a specific scalable vector type is thus constant within
3543IR, even if the exact size in bytes cannot be determined until run time.
3544
3545:Examples:
3546
3547+------------------------+----------------------------------------------------+
3548| ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
3549+------------------------+----------------------------------------------------+
3550| ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
3551+------------------------+----------------------------------------------------+
3552| ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
3553+------------------------+----------------------------------------------------+
3554| ``<4 x i64*>``         | Vector of 4 pointers to 64-bit integer values.     |
3555+------------------------+----------------------------------------------------+
3556| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
3557+------------------------+----------------------------------------------------+
3558
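As an illustration of the two syntactic forms, the same element-wise addition
can be written for a fixed-length and for a scalable vector. The function
names ``@add_fixed`` and ``@add_scalable`` are hypothetical.

.. code-block:: llvm

    ; @add_fixed is a hypothetical function used only for illustration.
    define <4 x i32> @add_fixed(<4 x i32> %a, <4 x i32> %b) {
    entry:
      ; Element-wise addition of two fixed-length vectors.
      %sum = add <4 x i32> %a, %b
      ret <4 x i32> %sum
    }

    ; @add_scalable is a hypothetical function used only for illustration.
    define <vscale x 4 x i32> @add_scalable(<vscale x 4 x i32> %a,
                                            <vscale x 4 x i32> %b) {
    entry:
      ; The element count is vscale * 4, where vscale is unknown at
      ; compile time.
      %sum = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %sum
    }
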
3559.. _t_label:
3560
3561Label Type
3562^^^^^^^^^^
3563
3564:Overview:
3565
3566The label type represents code labels.
3567
3568:Syntax:
3569
3570::
3571
3572      label
3573
3574.. _t_token:
3575
3576Token Type
3577^^^^^^^^^^
3578
3579:Overview:
3580
3581The token type is used when a value is associated with an instruction
3582but all uses of the value must not attempt to introspect or obscure it.
3583As such, it is not appropriate to have a :ref:`phi <i_phi>` or
3584:ref:`select <i_select>` of type token.
3585
3586:Syntax:
3587
3588::
3589
3590      token
3591
3592
3593
3594.. _t_metadata:
3595
3596Metadata Type
3597^^^^^^^^^^^^^
3598
3599:Overview:
3600
3601The metadata type represents embedded metadata. No derived types may be
3602created from metadata except for :ref:`function <t_function>` arguments.
3603
3604:Syntax:
3605
3606::
3607
3608      metadata
3609
3610.. _t_aggregate:
3611
3612Aggregate Types
3613^^^^^^^^^^^^^^^
3614
3615Aggregate Types are a subset of derived types that can contain multiple
3616member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
3617aggregate types. :ref:`Vectors <t_vector>` are not considered to be
3618aggregate types.
3619
3620.. _t_array:
3621
3622Array Type
3623""""""""""
3624
3625:Overview:
3626
3627The array type is a very simple derived type that arranges elements
3628sequentially in memory. The array type requires a size (number of
3629elements) and an underlying data type.
3630
3631:Syntax:
3632
3633::
3634
3635      [<# elements> x <elementtype>]
3636
3637The number of elements is a constant integer value; ``elementtype`` may
3638be any type with a size.
3639
3640:Examples:
3641
3642+------------------+--------------------------------------+
3643| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
3644+------------------+--------------------------------------+
3645| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
3646+------------------+--------------------------------------+
3647| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
3648+------------------+--------------------------------------+
3649
3650Here are some examples of multidimensional arrays:
3651
3652+-----------------------------+----------------------------------------------------------+
3653| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
3654+-----------------------------+----------------------------------------------------------+
3655| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
3656+-----------------------------+----------------------------------------------------------+
3657| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
3658+-----------------------------+----------------------------------------------------------+
3659
3660There is no restriction on indexing beyond the end of the array implied
3661by a static type (though there are restrictions on indexing beyond the
3662bounds of an allocated object in some cases). This means that
3663single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'Pascal-style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
3666example.
3667
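A minimal sketch of that addressing scheme follows. The type name
``%pascal_array`` and the function ``@element_at`` are hypothetical, and the
caller is assumed to have allocated enough storage after the length field for
the element being read.

.. code-block:: llvm

    ; %pascal_array is a hypothetical type: a length followed by the elements.
    %pascal_array = type { i32, [0 x float] }

    ; @element_at is a hypothetical function used only for illustration.
    define float @element_at(%pascal_array* %arr, i64 %idx) {
    entry:
      ; Index past the declared zero length of the trailing array; the
      ; static array type does not restrict the index.
      %elt = getelementptr %pascal_array, %pascal_array* %arr, i64 0, i32 1, i64 %idx
      %val = load float, float* %elt
      ret float %val
    }
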
3668.. _t_struct:
3669
3670Structure Type
3671""""""""""""""
3672
3673:Overview:
3674
3675The structure type is used to represent a collection of data members
3676together in memory. The elements of a structure may be any type that has
3677a size.
3678
3679Structures in memory are accessed using '``load``' and '``store``' by
3680getting a pointer to a field with the '``getelementptr``' instruction.
3681Structures in registers are accessed using the '``extractvalue``' and
3682'``insertvalue``' instructions.
3683
3684Structures may optionally be "packed" structures, which indicate that
3685the alignment of the struct is one byte, and that there is no padding
3686between the elements. In non-packed structs, padding between field types
3687is inserted as defined by the DataLayout string in the module, which is
3688required to match what the underlying code generator expects.
3689
3690Structures can either be "literal" or "identified". A literal structure
3691is defined inline with other types (e.g. ``{i32, i32}*``) whereas
3692identified types are always defined at the top level with a name.
3693Literal types are uniqued by their contents and can never be recursive
3694or opaque since there is no way to write one. Identified types can be
3695recursive, can be opaqued, and are never uniqued.
3696
3697:Syntax:
3698
3699::
3700
3701      %T1 = type { <type list> }     ; Identified normal struct type
3702      %T2 = type <{ <type list> }>   ; Identified packed struct type
3703
3704:Examples:
3705
3706+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3707| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                                                                                                                                      |
3708+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3709| ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``.  |
3710+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3711| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
3712+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3713
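The two ways of accessing structures described above can be sketched as
follows. The type names ``%point`` and ``%node`` and both function names are
hypothetical.

.. code-block:: llvm

    ; %point and %node are hypothetical identified struct types.
    %point = type { i32, i32 }
    ; An identified type may refer to itself; a literal type cannot.
    %node = type { i32, %node* }

    ; Hypothetical function: access a struct that lives in memory.
    define i32 @get_y_from_memory(%point* %p) {
    entry:
      ; Compute the address of field 1, then load it.
      %y.addr = getelementptr %point, %point* %p, i32 0, i32 1
      %y = load i32, i32* %y.addr
      ret i32 %y
    }

    ; Hypothetical function: access a struct that lives in a register.
    define %point @set_x_in_register(%point %v, i32 %x) {
    entry:
      ; insertvalue yields a new struct value with field 0 replaced.
      %updated = insertvalue %point %v, i32 %x, 0
      ret %point %updated
    }
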
3714.. _t_opaque:
3715
3716Opaque Structure Types
3717""""""""""""""""""""""
3718
3719:Overview:
3720
3721Opaque structure types are used to represent structure types that
3722do not have a body specified. This corresponds (for example) to the C
3723notion of a forward declared structure. They can be named (``%X``) or
3724unnamed (``%52``).
3725
3726:Syntax:
3727
3728::
3729
3730      %X = type opaque
3731      %52 = type opaque
3732
3733:Examples:
3734
3735+--------------+-------------------+
3736| ``opaque``   | An opaque type.   |
3737+--------------+-------------------+
3738
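A typical use, sketched here with hypothetical names (``%handle``,
``@open_handle``, ``@close_handle`` and ``@open_then_close``), is to pass
around pointers to a type whose layout is defined elsewhere:

.. code-block:: llvm

    ; %handle is a hypothetical opaque type; its body is not known here.
    %handle = type opaque

    ; Hypothetical external functions that produce and consume the handle.
    declare %handle* @open_handle()
    declare void @close_handle(%handle*)

    define void @open_then_close() {
    entry:
      ; Only pointers to the opaque type are used; the type itself has
      ; no known size.
      %h = call %handle* @open_handle()
      call void @close_handle(%handle* %h)
      ret void
    }
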
3739.. _constants:
3740
3741Constants
3742=========
3743
3744LLVM has several different basic types of constants. This section
3745describes them all and their syntax.
3746
3747Simple Constants
3748----------------
3749
3750**Boolean constants**
3751    The two strings '``true``' and '``false``' are both valid constants
3752    of the ``i1`` type.
3753**Integer constants**
3754    Standard integers (such as '4') are constants of the
3755    :ref:`integer <t_integer>` type. Negative numbers may be used with
3756    integer types.
3757**Floating-point constants**
3758    Floating-point constants use standard decimal notation (e.g.
3759    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
3760    hexadecimal notation (see below). The assembler requires the exact
3761    decimal value of a floating-point constant. For example, the
3762    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
3763    decimal in binary. Floating-point constants must have a
3764    :ref:`floating-point <t_floating>` type.
3765**Null pointer constants**
3766    The identifier '``null``' is recognized as a null pointer constant
3767    and must be of :ref:`pointer type <t_pointer>`.
3768**Token constants**
3769    The identifier '``none``' is recognized as an empty token constant
3770    and must be of :ref:`token type <t_token>`.
3771
3772The one non-intuitive notation for constants is the hexadecimal form of
3773floating-point constants. For example, the form
3774'``double    0x432ff973cafa8000``' is equivalent to (but harder to read
3775than) '``double 4.5e+15``'. The only time hexadecimal floating-point
3776constants are required (and the only time that they are generated by the
3777disassembler) is when a floating-point constant must be emitted but it
3778cannot be represented as a decimal floating-point number in a reasonable
3779number of digits. For example, NaN's, infinities, and other special
3780values are represented in their IEEE hexadecimal format so that assembly
3781and disassembly do not cause any bits to change in the constants.
3782
3783When using the hexadecimal form, constants of types bfloat, half, float, and
3784double are represented using the 16-digit form shown above (which matches the
3785IEEE754 representation for double); bfloat, half and float values must, however,
3786be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
3787precision respectively. Hexadecimal format is always used for long double, and
3788there are three forms of long double. The 80-bit format used by x86 is
3789represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
3790used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
3791hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
3792by 32 hexadecimal digits. Long doubles will only work if they match the long
3793double format on your target.  The IEEE 16-bit format (half precision) is
3794represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
3795format is represented by ``0xR`` followed by 4 hexadecimal digits. All
3796hexadecimal formats are big-endian (sign bit at the left).
3797
3798There are no constants of type x86_mmx and x86_amx.
3799
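As a short illustration of the notations above, the following global
definitions use one simple constant of each kind. The global names are
hypothetical.

.. code-block:: llvm

    ; All global names below are hypothetical and used only for illustration.
    @flag   = global i1 true
    @count  = global i32 -7
    @ratio  = global float 1.25
    @large  = global double 0x432ff973cafa8000         ; same value as 4.5e+15
    @nullp  = global i32* null
    @h_one  = global half 0xH3C00                      ; 1.0 as IEEE half
    @bf_one = global bfloat 0xR3F80                    ; 1.0 as bfloat
    @ld_one = global x86_fp80 0xK3FFF8000000000000000  ; 1.0 as x86 80-bit
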
3800.. _complexconstants:
3801
3802Complex Constants
3803-----------------
3804
3805Complex constants are a (potentially recursive) combination of simple
3806constants and smaller complex constants.
3807
3808**Structure constants**
3809    Structure constants are represented with notation similar to
3810    structure type definitions (a comma separated list of elements,
3811    surrounded by braces (``{}``)). For example:
3812    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
3813    "``@G = external global i32``". Structure constants must have
3814    :ref:`structure type <t_struct>`, and the number and types of elements
3815    must match those specified by the type.
3816**Array constants**
3817    Array constants are represented with notation similar to array type
3818    definitions (a comma separated list of elements, surrounded by
3819    square brackets (``[]``)). For example:
3820    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
3821    :ref:`array type <t_array>`, and the number and types of elements must
3822    match those specified by the type. As a special case, character array
3823    constants may also be represented as a double-quoted string using the ``c``
3824    prefix. For example: "``c"Hello World\0A\00"``".
3825**Vector constants**
3826    Vector constants are represented with notation similar to vector
3827    type definitions (a comma separated list of elements, surrounded by
3828    less-than/greater-than's (``<>``)). For example:
3829    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
3830    must have :ref:`vector type <t_vector>`, and the number and types of
3831    elements must match those specified by the type.
3832**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
3835    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
3836    having to print large zero initializers (e.g. for large arrays) and
3837    is always exactly equivalent to using explicit zero initializers.
3838**Metadata node**
3839    A metadata node is a constant tuple without types. For example:
3840    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
3841    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
3842    Unlike other typed constants that are meant to be interpreted as part of
3843    the instruction stream, metadata is a place to attach additional
3844    information such as debug info.
3845
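The constant forms above might be used as global initializers, as in the
following sketch. All global names other than ``@G`` are hypothetical.

.. code-block:: llvm

    @G     = external global i32
    ; The remaining global names are hypothetical.
    @s     = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }
    @a     = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    ; 11 characters plus a newline and a terminating NUL: 13 bytes.
    @msg   = constant [13 x i8] c"Hello World\0A\00"
    @v     = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    @zeros = global [1024 x i32] zeroinitializer
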
3846Global Variable and Function Addresses
3847--------------------------------------
3848
3849The addresses of :ref:`global variables <globalvars>` and
3850:ref:`functions <functionstructure>` are always implicitly valid
3851(link-time) constants. These constants are explicitly referenced when
3852the :ref:`identifier for the global <identifiers>` is used and always have
3853:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
3854file:
3855
3856.. code-block:: llvm
3857
3858    @X = global i32 17
3859    @Y = global i32 42
3860    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
3861
3862.. _undefvalues:
3863
3864Undefined Values
3865----------------
3866
3867The string '``undef``' can be used anywhere a constant is expected, and
3868indicates that the user of the value may receive an unspecified
3869bit-pattern. Undefined values may be of any type (other than '``label``'
3870or '``void``') and be used anywhere a constant is permitted.
3871
3872Undefined values are useful because they indicate to the compiler that
3873the program is well defined no matter what value is used. This gives the
3874compiler more freedom to optimize. Here are some examples of
3875(potentially surprising) transformations that are valid (in pseudo IR):
3876
3877.. code-block:: llvm
3878
3879      %A = add %X, undef
3880      %B = sub %X, undef
3881      %C = xor %X, undef
3882    Safe:
3883      %A = undef
3884      %B = undef
3885      %C = undef
3886
3887This is safe because all of the output bits are affected by the undef
3888bits. Any output bit can have a zero or one depending on the input bits.
3889
3890.. code-block:: llvm
3891
3892      %A = or %X, undef
3893      %B = and %X, undef
3894    Safe:
3895      %A = -1
3896      %B = 0
3897    Safe:
3898      %A = %X  ;; By choosing undef as 0
3899      %B = %X  ;; By choosing undef as -1
3900    Unsafe:
3901      %A = undef
3902      %B = undef
3903
3904These logical operations have bits that are not always affected by the
3905input. For example, if ``%X`` has a zero bit, then the output of the
3906'``and``' operation will always be a zero for that bit, no matter what
3907the corresponding bit from the '``undef``' is. As such, it is unsafe to
3908optimize or assume that the result of the '``and``' is '``undef``'.
3909However, it is safe to assume that all bits of the '``undef``' could be
39100, and optimize the '``and``' to 0. Likewise, it is safe to assume that
3911all the bits of the '``undef``' operand to the '``or``' could be set,
3912allowing the '``or``' to be folded to -1.
3913
3914.. code-block:: llvm
3915
3916      %A = select undef, %X, %Y
3917      %B = select undef, 42, %Y
3918      %C = select %X, %Y, undef
3919    Safe:
3920      %A = %X     (or %Y)
3921      %B = 42     (or %Y)
3922      %C = %Y
3923    Unsafe:
3924      %A = undef
3925      %B = undef
3926      %C = undef
3927
3928This set of examples shows that undefined '``select``' (and conditional
3929branch) conditions can go *either way*, but they have to come from one
3930of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
3931both known to have a clear low bit, then ``%A`` would have to have a
3932cleared low bit. However, in the ``%C`` example, the optimizer is
3933allowed to assume that the '``undef``' operand could be the same as
3934``%Y``, allowing the whole '``select``' to be eliminated.
3935
3936.. code-block:: llvm
3937
3938      %A = xor undef, undef
3939
3940      %B = undef
3941      %C = xor %B, %B
3942
3943      %D = undef
3944      %E = icmp slt %D, 4
      %F = icmp sge %D, 4
3946
3947    Safe:
3948      %A = undef
3949      %B = undef
3950      %C = undef
3951      %D = undef
3952      %E = undef
3953      %F = undef
3954
This example points out that two '``undef``' operands are not
necessarily the same. This can surprise people who assume that
"``X^X``" is always zero, even if ``X`` is undefined, although it
matches C semantics. The assumption fails for a number of reasons, but the
3959short answer is that an '``undef``' "variable" can arbitrarily change
3960its value over its "live range". This is true because the variable
3961doesn't actually *have a live range*. Instead, the value is logically
3962read from arbitrary registers that happen to be around when needed, so
3963the value is not necessarily consistent over time. In fact, ``%A`` and
3964``%C`` need to have the same semantics or the core LLVM "replace all
3965uses with" concept would not hold.
3966
3967To ensure all uses of a given register observe the same value (even if
3968'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
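
A minimal sketch of that use of '``freeze``' follows. The function name
``@xor_frozen`` is hypothetical; the point is only that both uses of the frozen
value observe the same bits.

.. code-block:: llvm

    define i32 @xor_frozen() {
    entry:
      %f = freeze i32 undef    ; some arbitrary, but fixed, i32 value
      %c = xor i32 %f, %f      ; both operands are the same value, so %c is 0
      ret i32 %c
    }

Unlike ``%C`` in the example above, ``%c`` here may be folded to ``0``, because
``%f`` is an ordinary value once it has been frozen.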
3969
3970.. code-block:: llvm
3971
3972      %A = sdiv undef, %X
3973      %B = sdiv %X, undef
3974    Safe:
3975      %A = 0
      unreachable  ;; replaces %B and all code after it
3977
3978These examples show the crucial difference between an *undefined value*
3979and *undefined behavior*. An undefined value (like '``undef``') is
3980allowed to have an arbitrary bit-pattern. This means that the ``%A``
3981operation can be constant folded to '``0``', because the '``undef``'
3982could be zero, and zero divided by any value is zero.
3983However, in the second example, we can make a more aggressive
3984assumption: because the ``undef`` is allowed to be an arbitrary value,
3985we are allowed to assume that it could be zero. Since a divide by zero
3986has *undefined behavior*, we are allowed to assume that the operation
3987does not execute at all. This allows us to delete the divide and all
3988code after it. Because the undefined operation "can't happen", the
3989optimizer can assume that it occurs in dead code.
3990
3991.. code-block:: text
3992
3993    a:  store undef -> %X
3994    b:  store %X -> undef
3995    Safe:
3996    a: <deleted>
3997    b: unreachable
3998
3999A store *of* an undefined value can be assumed to not have any effect;
4000we can assume that the value is overwritten with bits that happen to
4001match what was already there. However, a store *to* an undefined
4002location could clobber arbitrary memory, therefore, it has undefined
4003behavior.
4004
4005Branching on an undefined value is undefined behavior.
4006This explains optimizations that depend on branch conditions to construct
4007predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the condition should be frozen; otherwise,
switching on it is undefined behavior.
4010
4011.. code-block:: llvm
4012
4013    Unsafe:
4014      br undef, BB1, BB2 ; UB
4015
4016      %X = and i32 undef, 255
4017      switch %X, label %ret [ .. ] ; UB
4018
4019      store undef, i8* %ptr
4020      %X = load i8* %ptr ; %X is undef
4021      switch i8 %X, label %ret [ .. ] ; UB
4022
4023    Safe:
4024      %X = or i8 undef, 255 ; always 255
4025      switch i8 %X, label %ret [ .. ] ; Well-defined
4026
4027      %X = freeze i1 undef
4028      br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4029
4030
This is also consistent with the behavior of MemorySanitizer.
MemorySanitizer, a detector of uses of uninitialized memory,
defines a branch whose condition depends on an undef value (or
certain other values, e.g. the result of a load from heap-allocated
memory that has never been stored to) to have an externally visible
side effect. For this reason, functions with the *sanitize_memory*
attribute are not allowed to produce such branches "out of thin
air". More strictly, an optimization that inserts a conditional branch
4039is only valid if in all executions where the branch condition has at
4040least one undefined bit, the same branch condition is evaluated in the
4041input IR as well.
4042
4043.. _poisonvalues:
4044
4045Poison Values
4046-------------
4047
4048A poison value is a result of an erroneous operation.
4049In order to facilitate speculative execution, many instructions do not
4050invoke immediate undefined behavior when provided with illegal operands,
4051and return a poison value instead.
4052The string '``poison``' can be used anywhere a constant is expected, and
4053operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4054a poison value.
4055
4056Poison value behavior is defined in terms of value *dependence*:
4057
4058-  Values other than :ref:`phi <i_phi>` nodes, :ref:`select <i_select>`, and
4059   :ref:`freeze <i_freeze>` instructions depend on their operands.
4060-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
4061   their dynamic predecessor basic block.
4062-  :ref:`Select <i_select>` instructions depend on their condition operand and
4063   their selected operand.
4064-  Function arguments depend on the corresponding actual argument values
4065   in the dynamic callers of their functions.
4066-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
4067   instructions that dynamically transfer control back to them.
4068-  :ref:`Invoke <i_invoke>` instructions depend on the
4069   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
4070   call instructions that dynamically transfer control back to them.
4071-  Non-volatile loads and stores depend on the most recent stores to all
4072   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`).
4075-  An instruction with externally visible side effects depends on the
4076   most recent preceding instruction with externally visible side
4077   effects, following the order in the IR. (This includes :ref:`volatile
4078   operations <volatile>`.)
4079-  An instruction *control-depends* on a :ref:`terminator
4080   instruction <terminators>` if the terminator instruction has
4081   multiple successors and the instruction is always executed when
4082   control transfers to one of the successors, and may not be executed
4083   when control is transferred to another.
4084-  Additionally, an instruction also *control-depends* on a terminator
4085   instruction if the set of instructions it otherwise depends on would
4086   be different if the terminator had transferred control to a different
4087   successor.
4088-  Dependence is transitive.
4089-  Vector elements may be independently poisoned. Therefore, transforms
4090   on instructions such as shufflevector must be careful to propagate
4091   poison across values or elements only as allowed by the original code.
4092
An instruction that *depends* on a poison value produces a poison value
itself. A poison value may be relaxed into an
4095:ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern.
4096Propagation of poison can be stopped with the
4097:ref:`freeze instruction <i_freeze>`.
4098
4099This means that immediate undefined behavior occurs if a poison value is
4100used as an instruction operand that has any values that trigger undefined
4101behavior. Notably this includes (but is not limited to):
4102
4103-  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4104   any other pointer dereferencing instruction (independent of address
4105   space).
4106-  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4107   instruction.
4108-  The condition operand of a :ref:`br <i_br>` instruction.
4109-  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4110   instruction.
4111-  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4112   instruction, when the function or invoking call site has a ``noundef``
4113   attribute in the corresponding position.
-  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
   call site has a ``noundef`` attribute in the return value position.
4116
4117Here are some examples:
4118
4119.. code-block:: llvm
4120
4121    entry:
4122      %poison = sub nuw i32 0, 1           ; Results in a poison value.
4123      %poison2 = sub i32 poison, 1         ; Also results in a poison value.
4124      %still_poison = and i32 %poison, 0   ; 0, but also poison.
4125      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
4126      store i32 0, i32* %poison_yet_again  ; Undefined behavior due to
4127                                           ; store to poison.
4128
4129      store i32 %poison, i32* @g           ; Poison value stored to memory.
4130      %poison3 = load i32, i32* @g         ; Poison value loaded back from memory.
4131
4132      %narrowaddr = bitcast i32* @g to i16*
4133      %wideaddr = bitcast i32* @g to i64*
4134      %poison4 = load i16, i16* %narrowaddr ; Returns a poison value.
4135      %poison5 = load i64, i64* %wideaddr   ; Returns a poison value.
4136
4137      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
4138      br i1 %cmp, label %end, label %end   ; undefined behavior
4139
4140    end:
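
To complement the examples above, the following sketch shows how
'``freeze``' stops poison from reaching a branch condition. The function name
``@sum_or_zero`` and its behavior are hypothetical and serve only as an
illustration.

.. code-block:: llvm

    define i32 @sum_or_zero(i32 %a, i32 %b) {
    entry:
      %sum = add nsw i32 %a, %b    ; poison if the signed addition overflows
      %fr = freeze i32 %sum        ; arbitrary but fixed value if %sum was poison
      %neg = icmp slt i32 %fr, 0   ; not poison, because %fr is not poison
      br i1 %neg, label %is_neg, label %done   ; well defined

    is_neg:
      ret i32 0

    done:
      ret i32 %fr
    }

Without the '``freeze``', an overflowing addition would make ``%neg`` poison,
and the conditional branch would have undefined behavior.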
4141
4142.. _welldefinedvalues:
4143
4144Well-Defined Values
4145-------------------
4146
4147Given a program execution, a value is *well defined* if the value does not
4148have an undef bit and is not poison in the execution.
4149An aggregate value or vector is well defined if its elements are well defined.
4150The padding of an aggregate isn't considered, since it isn't visible
4151without storing it into memory and loading it with a different type.
4152
A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
of its operand.
4157
4158.. _blockaddress:
4159
4160Addresses of Basic Blocks
4161-------------------------
4162
4163``blockaddress(@function, %block)``
4164
4165The '``blockaddress``' constant computes the address of the specified
4166basic block in the specified function.
4167
4168It always has an ``i8 addrspace(P)*`` type, where ``P`` is the address space
4169of the function containing ``%block`` (usually ``addrspace(0)``).
4170
4171Taking the address of the entry block is illegal.
4172
This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
for comparisons against null. Pointer equality tests between label addresses
result in undefined behavior, though, again, comparison against null is ok,
and no label is equal to the null pointer. This may be passed around as an
4178opaque pointer sized value as long as the bits are not inspected. This
4179allows ``ptrtoint`` and arithmetic to be performed on these values so
4180long as the original value is reconstituted before the ``indirectbr`` or
4181``callbr`` instruction.
4182
4183Finally, some targets may provide defined semantics when using the value
4184as the operand to an inline assembly, but that is target specific.
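
A minimal sketch of the intended use, a jump table consumed by
'``indirectbr``', might look as follows. The names ``@targets`` and
``@dispatch`` are hypothetical, and the typed-pointer syntax matches the other
examples in this document.

.. code-block:: llvm

    ; A table of block addresses. Neither entry refers to the entry block.
    @targets = internal constant [2 x i8*] [i8* blockaddress(@dispatch, %even),
                                            i8* blockaddress(@dispatch, %odd)]

    define i32 @dispatch(i32 %n) {
    entry:
      %idx = and i32 %n, 1
      %slot = getelementptr inbounds [2 x i8*], [2 x i8*]* @targets, i32 0, i32 %idx
      %dest = load i8*, i8** %slot
      ; Every possible destination must be listed here.
      indirectbr i8* %dest, [label %even, label %odd]

    even:
      ret i32 0

    odd:
      ret i32 1
    }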
4185
4186.. _dso_local_equivalent:
4187
4188DSO Local Equivalent
4189--------------------
4190
4191``dso_local_equivalent @func``
4192
4193A '``dso_local_equivalent``' constant represents a function which is
4194functionally equivalent to a given function, but is always defined in the
4195current linkage unit. The resulting pointer has the same type as the underlying
4196function. The resulting pointer is permitted, but not required, to be different
4197from a pointer to the function, and it may have different values in different
4198translation units.
4199
4200The target function may not have ``extern_weak`` linkage.
4201
``dso_local_equivalent`` can be implemented as follows:
4203
4204- If the function has local linkage, hidden visibility, or is
4205  ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
4206  to the function.
4207- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
4208  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined within the
4210  linkage unit; LLVM will use this when available. (This is commonly called a
4211  "PLT stub".) On other targets, the stub may need to be emitted explicitly.
4212
4213This can be used wherever a ``dso_local`` instance of a function is needed without
4214needing to explicitly make the original function ``dso_local``. An instance where
4215this can be used is for static offset calculations between a function and some other
4216``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4217where dynamic relocations for function pointers in VTables can be replaced with
4218static relocations for offsets between the VTable and virtual functions which
4219may not be ``dso_local``.
4220
4221This is currently only supported for ELF binary formats.
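
The static offset calculation described above can be sketched as follows. The
names ``@extern_func`` and ``@rel_entry`` are hypothetical; the sketch assumes
an ELF target and uses the constant expression forms documented in this manual.

.. code-block:: llvm

    ; Not known to be dso_local; it may be defined in another linkage unit.
    declare void @extern_func()

    ; A 32-bit offset from @rel_entry to a dso_local equivalent of @extern_func.
    ; Both symbols are local to the linkage unit, so no dynamic relocation
    ; should be needed for this initializer.
    @rel_entry = dso_local constant i32 trunc (
        i64 sub (i64 ptrtoint (void ()* dso_local_equivalent @extern_func to i64),
                 i64 ptrtoint (i32* @rel_entry to i64)) to i32)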
4222
4223.. _constantexprs:
4224
4225Constant Expressions
4226--------------------
4227
4228Constant expressions are used to allow expressions involving other
4229constants to be used as constants. Constant expressions may be of any
4230:ref:`first class <t_firstclass>` type and may involve any LLVM operation
4231that does not have side effects (e.g. load and call are not supported).
4232The following is the syntax for constant expressions:
4233
4234``trunc (CST to TYPE)``
4235    Perform the :ref:`trunc operation <i_trunc>` on constants.
4236``zext (CST to TYPE)``
4237    Perform the :ref:`zext operation <i_zext>` on constants.
4238``sext (CST to TYPE)``
4239    Perform the :ref:`sext operation <i_sext>` on constants.
4240``fptrunc (CST to TYPE)``
4241    Truncate a floating-point constant to another floating-point type.
4242    The size of CST must be larger than the size of TYPE. Both types
4243    must be floating-point.
4244``fpext (CST to TYPE)``
    Floating-point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating-point.
4248``fptoui (CST to TYPE)``
4249    Convert a floating-point constant to the corresponding unsigned
4250    integer constant. TYPE must be a scalar or vector integer type. CST
4251    must be of scalar or vector floating-point type. Both CST and TYPE
4252    must be scalars, or vectors of the same number of elements. If the
4253    value won't fit in the integer type, the result is a
4254    :ref:`poison value <poisonvalues>`.
4255``fptosi (CST to TYPE)``
4256    Convert a floating-point constant to the corresponding signed
4257    integer constant. TYPE must be a scalar or vector integer type. CST
4258    must be of scalar or vector floating-point type. Both CST and TYPE
4259    must be scalars, or vectors of the same number of elements. If the
4260    value won't fit in the integer type, the result is a
4261    :ref:`poison value <poisonvalues>`.
4262``uitofp (CST to TYPE)``
4263    Convert an unsigned integer constant to the corresponding
4264    floating-point constant. TYPE must be a scalar or vector floating-point
4265    type.  CST must be of scalar or vector integer type. Both CST and TYPE must
4266    be scalars, or vectors of the same number of elements.
4267``sitofp (CST to TYPE)``
4268    Convert a signed integer constant to the corresponding floating-point
4269    constant. TYPE must be a scalar or vector floating-point type.
4270    CST must be of scalar or vector integer type. Both CST and TYPE must
4271    be scalars, or vectors of the same number of elements.
4272``ptrtoint (CST to TYPE)``
4273    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4274``inttoptr (CST to TYPE)``
4275    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4276    This one is *really* dangerous!
4277``bitcast (CST to TYPE)``
4278    Convert a constant, CST, to another TYPE.
4279    The constraints of the operands are the same as those for the
4280    :ref:`bitcast instruction <i_bitcast>`.
4281``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
4283    TYPE in a different address space. The constraints of the operands are the
4284    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4285``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4286    Perform the :ref:`getelementptr operation <i_getelementptr>` on
4287    constants. As with the :ref:`getelementptr <i_getelementptr>`
4288    instruction, the index list may have one or more indexes, which are
4289    required to make sense for the type of "pointer to TY".
4290``select (COND, VAL1, VAL2)``
4291    Perform the :ref:`select operation <i_select>` on constants.
4292``icmp COND (VAL1, VAL2)``
4293    Perform the :ref:`icmp operation <i_icmp>` on constants.
4294``fcmp COND (VAL1, VAL2)``
4295    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
4296``extractelement (VAL, IDX)``
4297    Perform the :ref:`extractelement operation <i_extractelement>` on
4298    constants.
4299``insertelement (VAL, ELT, IDX)``
4300    Perform the :ref:`insertelement operation <i_insertelement>` on
4301    constants.
4302``shufflevector (VEC1, VEC2, IDXMASK)``
4303    Perform the :ref:`shufflevector operation <i_shufflevector>` on
4304    constants.
4305``extractvalue (VAL, IDX0, IDX1, ...)``
4306    Perform the :ref:`extractvalue operation <i_extractvalue>` on
4307    constants. The index list is interpreted in a similar manner as
4308    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
4309    least one index value must be specified.
4310``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
4311    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
4312    The index list is interpreted in a similar manner as indices in a
4313    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
4314    value must be specified.
4315``OPCODE (LHS, RHS)``
4316    Perform the specified operation of the LHS and RHS constants. OPCODE
4317    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
4318    binary <bitwiseops>` operations. The constraints on operands are
4319    the same as those for the corresponding instruction (e.g. no bitwise
4320    operations on floating-point values are allowed).
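
As an illustration of the syntax listed above, the following sketch uses
constant expressions both in global initializers and as an instruction operand.
All names (``@table``, ``@third``, ``@table_plus_8``, ``@load_third``) are
hypothetical.

.. code-block:: llvm

    @table = global [4 x i32] [i32 10, i32 20, i32 30, i32 40]

    ; Address of the third element, computed by a constant getelementptr.
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @table, i64 0, i64 2)

    ; The address of @table as an integer, offset by 8 bytes.
    @table_plus_8 = global i64 add (i64 ptrtoint ([4 x i32]* @table to i64), i64 8)

    ; Constant expressions may also appear directly as instruction operands.
    define i32 @load_third() {
    entry:
      %v = load i32, i32* getelementptr inbounds ([4 x i32], [4 x i32]* @table, i64 0, i64 2)
      ret i32 %v
    }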
4321
4322Other Values
4323============
4324
4325.. _inlineasmexprs:
4326
4327Inline Assembler Expressions
4328----------------------------
4329
4330LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4331Inline Assembly <moduleasm>`) through the use of a special value. This value
4332represents the inline assembler as a template string (containing the
4333instructions to emit), a list of operand constraints (stored as a string), a
4334flag that indicates whether or not the inline asm expression has side effects,
4335and a flag indicating whether the function containing the asm needs to align its
4336stack conservatively.
4337
4338The template string supports argument substitution of the operands using "``$``"
4339followed by a number, to indicate substitution of the given register/memory
4340location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
4341be used, where ``MODIFIER`` is a target-specific annotation for how to print the
4342operand (See :ref:`inline-asm-modifiers`).
4343
A literal "``$``" may be included by using "``$$``" in the template. To include
other special characters in the output, the usual "``\XX``" escapes may be
4346used, just as in other strings. Note that after template substitution, the
4347resulting assembly string is parsed by LLVM's integrated assembler unless it is
4348disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
4349syntax known to LLVM.
4350
4351LLVM also supports a few more substitutions useful for writing inline assembly:
4352
4353- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
4354  This substitution is useful when declaring a local label. Many standard
4355  compiler optimizations, such as inlining, may duplicate an inline asm blob.
4356  Adding a blob-unique identifier ensures that the two labels will not conflict
4357  during assembly. This is used to implement `GCC's %= special format
4358  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
4359- ``${:comment}``: Expands to the comment character of the current target's
4360  assembly dialect. This is usually ``#``, but many targets use other strings,
4361  such as ``;``, ``//``, or ``!``.
4362- ``${:private}``: Expands to the assembler private label prefix. Labels with
4363  this prefix will not appear in the symbol table of the assembled object.
4364  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
4365  relatively popular.
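
The following sketch combines ``${:private}`` and ``${:uid}`` to declare a local
label that stays unique even if the blob is duplicated. It assumes an x86
target with AT&T syntax; the function name ``@skip_nop`` and the label stem
``skip`` are hypothetical.

.. code-block:: llvm

    define void @skip_nop() {
    entry:
      ; With a private prefix of ".L" this would expand to roughly:
      ;       jmp .Lskip<N>
      ;       nop
      ;   .Lskip<N>:
      ; where <N> is the integer LLVM picks for this blob.
      call void asm sideeffect "jmp ${:private}skip${:uid}\0A\09nop\0A${:private}skip${:uid}:", ""()
      ret void
    }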
4366
4367LLVM's support for inline asm is modeled closely on the requirements of Clang's
4368GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
4369modifier codes listed here are similar or identical to those in GCC's inline asm
4370support. However, to be clear, the syntax of the template and constraint strings
4371described here is *not* the same as the syntax accepted by GCC and Clang, and,
4372while most constraint letters are passed through as-is by Clang, some get
4373translated to other codes when converting from the C source to the LLVM
4374assembly.
4375
4376An example inline assembler expression is:
4377
4378.. code-block:: llvm
4379
4380    i32 (i32) asm "bswap $0", "=r,r"
4381
4382Inline assembler expressions may **only** be used as the callee operand
4383of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
4384Thus, typically we have:
4385
4386.. code-block:: llvm
4387
4388    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
4389
4390Inline asms with side effects not visible in the constraint list must be
4391marked as having side effects. This is done through the use of the
4392'``sideeffect``' keyword, like so:
4393
4394.. code-block:: llvm
4395
4396    call void asm sideeffect "eieio", ""()
4397
4398In some cases inline asms will contain code that will not work unless
4399the stack is aligned in some way, such as calls or SSE instructions on
4400x86, yet will not contain code that does that alignment within the asm.
4401The compiler should make conservative assumptions about what the asm
4402might contain and should generate its usual stack alignment code in the
4403prologue if the '``alignstack``' keyword is present:
4404
4405.. code-block:: llvm
4406
4407    call void asm alignstack "eieio", ""()
4408
4409Inline asms also support using non-standard assembly dialects. The
4410assumed dialect is ATT. When the '``inteldialect``' keyword is present,
4411the inline asm is using the Intel dialect. Currently, ATT and Intel are
4412the only supported dialects. An example is:
4413
4414.. code-block:: llvm
4415
4416    call void asm inteldialect "eieio", ""()
4417
4418In the case that the inline asm might unwind the stack,
4419the '``unwind``' keyword must be used, so that the compiler emits
4420unwinding information:
4421
4422.. code-block:: llvm
4423
4424    call void asm unwind "call func", ""()
4425
4426If the inline asm unwinds the stack and isn't marked with
4427the '``unwind``' keyword, the behavior is undefined.
4428
4429If multiple keywords appear, the '``sideeffect``' keyword must come
4430first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
4431third and the '``unwind``' keyword last.
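
For instance, a call that combines three of the keywords in the required order
could be written as in this sketch. The function name ``@ordered_keywords`` is
hypothetical, and ``nop`` assumes an x86 target.

.. code-block:: llvm

    define void @ordered_keywords() {
    entry:
      ; sideeffect first, then alignstack, then inteldialect.
      call void asm sideeffect alignstack inteldialect "nop", ""()
      ret void
    }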
4432
4433Inline Asm Constraint String
4434^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4435
4436The constraint list is a comma-separated string, each element containing one or
4437more constraint codes.
4438
4439For each element in the constraint list an appropriate register or memory
4440operand will be chosen, and it will be made available to assembly template
4441string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
4442second, etc.
4443
4444There are three different types of constraints, which are distinguished by a
4445prefix symbol in front of the constraint code: Output, Input, and Clobber. The
4446constraints must always be given in that order: outputs first, then inputs, then
4447clobbers. They cannot be intermingled.
4448
4449There are also three different categories of constraint codes:
4450
4451- Register constraint. This is either a register class, or a fixed physical
4452  register. This kind of constraint will allocate a register, and if necessary,
4453  bitcast the argument or result to the appropriate type.
4454- Memory constraint. This kind of constraint is for use with an instruction
4455  taking a memory operand. Different constraints allow for different addressing
4456  modes used by the target.
4457- Immediate value constraint. This kind of constraint is for an integer or other
4458  immediate value which can be rendered directly into an instruction. The
4459  various target-specific constraints allow the selection of a value in the
4460  proper range for the instruction you wish to use it with.
4461
4462Output constraints
4463""""""""""""""""""
4464
4465Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
4466indicates that the assembly will write to this operand, and the operand will
4467then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction (except for
indirect outputs, described below).
4470
4471Normally, it is expected that no output locations are written to by the assembly
4472expression until *all* of the inputs have been read. As such, LLVM may assign
4473the same register to an output and an input. If this is not safe (e.g. if the
4474assembly contains two instructions, where the first writes to one output, and
4475the second reads an input and writes to a second output), then the "``&``"
4476modifier must be used (e.g. "``=&r``") to specify that the output is an
4477"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
4478will not use the same register for any inputs (other than an input tied to this
4479output).
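
A minimal sketch of an early-clobber output, assuming an x86 target with AT&T
syntax, is shown below. The function name ``@add_two`` is hypothetical.

.. code-block:: llvm

    define i32 @add_two(i32 %a, i32 %b) {
    entry:
      ; The first instruction writes the output ($0) before the second
      ; instruction reads the input $2, so $0 must not share a register
      ; with $2. The "&" modifier requests exactly that.
      %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %r
    }

Without the "``&``", LLVM would be free to assign ``$0`` and ``$2`` to the same
register, in which case the ``movl`` would overwrite ``%b`` before it is read.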
4480
4481Input constraints
4482"""""""""""""""""
4483
4484Input constraints do not have a prefix -- just the constraint codes. Each input
4485constraint will consume one argument from the call instruction. It is not
4486permitted for the asm to write to any input register or memory location (unless
4487that input is tied to an output). Note also that multiple inputs may all be
4488assigned to the same register, if LLVM can determine that they necessarily all
4489contain the same value.
4490
4491Instead of providing a Constraint Code, input constraints may also "tie"
4492themselves to an output constraint, by providing an integer as the constraint
4493string. Tied inputs still consume an argument from the call instruction, and
4494take up a position in the asm template numbering as is usual -- they will simply
4495be constrained to always use the same register as the output they've been tied
4496to. For example, a constraint string of "``=r,0``" says to assign a register for
4497output, and use that register as an input as well (it being the 0'th
4498constraint).
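
A tied input is the natural fit for an instruction that reads and writes the
same register. The sketch below assumes an x86 target; the function name
``@swap_bytes`` is hypothetical.

.. code-block:: llvm

    define i32 @swap_bytes(i32 %x) {
    entry:
      ; "0" ties the input to output 0, so %x arrives in the same register
      ; that the result is read from after the bswap.
      %r = call i32 asm "bswap $0", "=r,0"(i32 %x)
      ret i32 %r
    }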
4499
4500It is permitted to tie an input to an "early-clobber" output. In that case, no
4501*other* input may share the same register as the input tied to the early-clobber
4502(even when the other input has the same value).
4503
4504You may only tie an input to an output which has a register constraint, not a
4505memory constraint. Only a single input may be tied to an output.
4506
4507There is also an "interesting" feature which deserves a bit of explanation: if a
4508register class constraint allocates a register which is too small for the value
4509type operand provided as input, the input value will be split into multiple
4510registers, and all of them passed to the inline asm.
4511
4512However, this feature is often not as useful as you might think.
4513
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (For example, the
32-bit SparcV8 has a 64-bit load instruction that names a single 32-bit
register; the hardware then loads into both the named register and the next
register. This feature of inline asm would not be useful to support that.)
4520
4521A few of the targets provide a template string modifier allowing explicit access
4522to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
4523``D``). On such an architecture, you can actually access the second allocated
4524register (yet, still, not any subsequent ones). But, in that case, you're still
4525probably better off simply splitting the value into two separate operands, for
clarity. (For example, see the description of the ``A`` constraint on X86,
which, despite existing only for use with this feature, is not really a good
idea to use.)
4529
4530Indirect inputs and outputs
4531"""""""""""""""""""""""""""
4532
4533Indirect output or input constraints can be specified by the "``*``" modifier
4534(which goes after the "``=``" in case of an output). This indicates that the asm
4535will write to or read from the contents of an *address* provided as an input
4536argument. (Note that in this way, indirect outputs act more like an *input* than
4537an output: just like an input, they consume an argument of the call expression,
4538rather than producing a return value. An indirect output constraint is an
4539"output" only in that the asm is expected to write to the contents of the input
4540memory location, instead of just read from it).
4541
This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
4543address of a variable as a value.
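
A minimal sketch of an indirect memory output, assuming an x86 target with AT&T
syntax and the typed-pointer IR used elsewhere in this document, is shown below.
The function name ``@store_one`` is hypothetical.

.. code-block:: llvm

    define void @store_one(i32* %p) {
    entry:
      ; "=*m" consumes the pointer argument and produces no return value.
      ; "$$1" is a literal "$1" (an immediate in AT&T syntax), and $0 expands
      ; to a memory operand that addresses the location %p points to.
      call void asm "movl $$1, $0", "=*m"(i32* %p)
      ret void
    }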
4544
4545It is also possible to use an indirect *register* constraint, but only on output
4546(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
4547value normally, and then, separately emit a store to the address provided as
4548input, after the provided inline asm. (It's not clear what value this
4549functionality provides, compared to writing the store explicitly after the asm
4550statement, and it can only produce worse code, since it bypasses many
4551optimization passes. I would recommend not using it.)
4552
4553
4554Clobber constraints
4555"""""""""""""""""""
4556
4557A clobber constraint is indicated by a "``~``" prefix. A clobber does not
4558consume an input operand, nor generate an output. Clobbers cannot use any of the
4559general constraint code letters -- they may use only explicit register
4560constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
4561"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
4562memory locations -- not only the memory pointed to by a declared indirect
4563output.
4564
4565Note that clobbering named registers that are also present in output
4566constraints is not legal.
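
One common use of a clobber is an empty asm that acts purely as a barrier to
compiler reordering of memory accesses. In this sketch the function name
``@compiler_barrier`` is hypothetical.

.. code-block:: llvm

    define void @compiler_barrier() {
    entry:
      ; No instructions are emitted. The "~{memory}" clobber tells LLVM that
      ; arbitrary memory may be written, and sideeffect keeps the call alive.
      call void asm sideeffect "", "~{memory}"()
      ret void
    }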
4567
4568
4569Constraint Codes
4570""""""""""""""""

After a potential prefix comes the constraint code, or codes.
4572
4573A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
4574followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
4575(e.g. "``{eax}``").
4576
4577The one and two letter constraint codes are typically chosen to be the same as
4578GCC's constraint codes.
4579
A single constraint may include more than one constraint code, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.
4583
4584There are two ways to specify alternatives, and either or both may be used in an
4585inline asm constraint list:
4586
45871) Append the codes to each other, making a constraint code set. E.g. "``im``"
4588   or "``{eax}m``". This means "choose any of the options in the set". The
4589   choice of constraint is made independently for each constraint in the
4590   constraint list.
4591
45922) Use "``|``" between constraint code sets, creating alternatives. Every
4593   constraint in the constraint list must have the same number of alternative
4594   sets. With this syntax, the same alternative in *all* of the items in the
4595   constraint list will be chosen together.
4596
Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. However, operands 0 and 1 cannot both be of type
``m``.
4601
However, the use of either of these alternative features is *NOT* recommended,
as LLVM is not able to make an intelligent choice about which one to use. (At
the point it currently needs to choose, not enough information is available to
do so in a smart way.) Thus, it simply tries to make a choice that's most likely
to compile, not one that will give optimal performance (e.g., given "``rm``",
it'll always choose to use memory, not registers). And, if given multiple
registers, or multiple register classes, it will simply choose the first one.
(In fact, it doesn't currently even ensure explicitly specified physical
registers are unique, so specifying multiple physical registers as alternatives,
like ``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, which is not
at all what was intended.)
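
A hedged illustration of a constraint code set, assuming an x86 target with
AT&T syntax (the function name ``@copy`` is illustrative). Operand 1 may live
in a register or in memory; following the behavior described above, LLVM would
currently be expected to pick memory and read ``%x`` from a stack slot:

.. code-block:: llvm

    define i32 @copy(i32 %x) {
      ; "rm" is a set of two codes: a register or a memory operand.
      %r = call i32 asm "movl $1, $0", "=r,rm"(i32 %x)
      ret i32 %r
    }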
4613
4614Supported Constraint Code List
4615""""""""""""""""""""""""""""""
4616
4617The constraint codes are, in general, expected to behave the same way they do in
4618GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4619inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4620and GCC likely indicates a bug in LLVM.
4621
4622Some constraint codes are typically supported by all targets:
4623
4624- ``r``: A register in the target's general purpose register class.
4625- ``m``: A memory address operand. It is target-specific what addressing modes
4626  are supported, typical examples are register, or register + register offset,
4627  or register + immediate offset (of some target-specific size).
4628- ``i``: An integer constant (of target-specific width). Allows either a simple
4629  immediate, or a relocatable value.
4630- ``n``: An integer constant -- *not* including relocatable values.
4631- ``s``: An integer constant, but allowing *only* relocatable values.
4632- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
4633  useful to pass a label for an asm branch or call.
4634
  .. FIXME: but that surely isn't actually okay to jump out of an asm
     block without telling llvm about the control transfer???
4637
4638- ``{register-name}``: Requires exactly the named physical register.
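
As a small sketch of the generic codes, assuming an x86 target with AT&T
syntax (the function name is illustrative). The ``0`` constraint ties the
first input to output 0, as described for input constraints elsewhere in this
document:

.. code-block:: llvm

    define i32 @add16(i32 %x) {
      ; "=r": output in a general purpose register.
      ; "0":  input that must share the location of operand 0.
      ; "i":  integer constant, printed as an immediate.
      %r = call i32 asm "addl $2, $0", "=r,0,i,~{flags}"(i32 %x, i32 16)
      ret i32 %r
    }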
4639
4640Other constraints are target-specific:
4641
4642AArch64:
4643
4644- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
4645- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
4646  i.e. 0 to 4095 with optional shift by 12.
4647- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
4648  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
4649- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
4650  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
4651- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
4652  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
4657- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
4658  64-bit register. This is a superset of ``L``.
4659- ``Q``: Memory address operand must be in a single register (no
4660  offsets). (However, LLVM currently does this for the ``m`` constraint as
4661  well.)
4662- ``r``: A 32 or 64-bit integer register (W* or X*).
4663- ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
4664- ``x``: Like w, but restricted to registers 0 to 15 inclusive.
4665- ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
- ``Upl``: One of the low eight SVE predicate registers (P0 to P7).
- ``Upa``: Any of the SVE predicate registers (P0 to P15).
4668
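
For instance, the AArch64 ``I`` constraint could be used as in the sketch below
(the function name is illustrative). The constant 4095 is accepted because it
fits the unshifted 12-bit immediate of ``ADD``:

.. code-block:: llvm

    define i64 @add_imm(i64 %x) {
      %r = call i64 asm "add $0, $1, $2", "=r,r,I"(i64 %x, i64 4095)
      ret i64 %r
    }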
4669AMDGPU:
4670
4671- ``r``: A 32 or 64-bit integer register.
4672- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
4673- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
4674- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
4675- ``I``: An integer inline constant in the range from -16 to 64.
4676- ``J``: A 16-bit signed integer constant.
4677- ``A``: An integer or a floating-point inline constant.
4678- ``B``: A 32-bit signed integer constant.
4679- ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
4680- ``DA``: A 64-bit constant that can be split into two "A" constants.
4681- ``DB``: A 64-bit constant that can be split into two "B" constants.
4682
4683All ARM modes:
4684
4685- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
4686  operand. Treated the same as operand ``m``, at the moment.
4687- ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
4688- ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
4689
4690ARM and ARM's Thumb2 mode:
4691
4692- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
4693- ``I``: An immediate integer valid for a data-processing instruction.
4694- ``J``: An immediate integer between -4095 and 4095.
4695- ``K``: An immediate integer whose bitwise inverse is valid for a
4696  data-processing instruction. (Can be used with template modifier "``B``" to
4697  print the inverted value).
4698- ``L``: An immediate integer whose negation is valid for a data-processing
4699  instruction. (Can be used with template modifier "``n``" to print the negated
4700  value).
4701- ``M``: A power of two or an integer between 0 and 32.
4702- ``N``: Invalid immediate constraint.
4703- ``O``: Invalid immediate constraint.
4704- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
4705- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
4706  as ``r``.
4707- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
4708  invalid.
4709- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4710  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4711- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4712  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4713- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4714  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4715
4716ARM's Thumb1 mode:
4717
4718- ``I``: An immediate integer between 0 and 255.
4719- ``J``: An immediate integer between -255 and -1.
4720- ``K``: An immediate integer between 0 and 255, with optional left-shift by
4721  some amount.
4722- ``L``: An immediate integer between -7 and 7.
4723- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
4724- ``N``: An immediate integer between 0 and 31.
4725- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
4726- ``r``: A low 32-bit GPR register (``r0-r7``).
4727- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
4729- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4730  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4731- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4732  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4733- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4734  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4735
4736
4737Hexagon:
4738
4739- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
4740  at the moment.
4741- ``r``: A 32 or 64-bit register.
4742
4743MSP430:
4744
4745- ``r``: An 8 or 16-bit register.
4746
4747MIPS:
4748
4749- ``I``: An immediate signed 16-bit integer.
4750- ``J``: An immediate integer zero.
4751- ``K``: An immediate unsigned 16-bit integer.
4752- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
4753- ``N``: An immediate integer between -65535 and -1.
4754- ``O``: An immediate signed 15-bit integer.
4755- ``P``: An immediate integer between 1 and 65535.
4756- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
4757  register plus 16-bit immediate offset. In MIPS mode, just a base register.
4758- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
4759  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
4760  ``m``.
4761- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
4762  ``sc`` instruction on the given subtarget (details vary).
4763- ``r``, ``d``,  ``y``: A 32 or 64-bit GPR register.
4764- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
4765  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
4766  argument modifier for compatibility with GCC.
4767- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
4768  ``25``).
4769- ``l``: The ``lo`` register, 32 or 64-bit.
4770- ``x``: Invalid.
4771
4772NVPTX:
4773
4774- ``b``: A 1-bit integer register.
4775- ``c`` or ``h``: A 16-bit integer register.
4776- ``r``: A 32-bit integer register.
4777- ``l`` or ``N``: A 64-bit integer register.
4778- ``f``: A 32-bit float register.
4779- ``d``: A 64-bit float register.
4780
4781
4782PowerPC:
4783
4784- ``I``: An immediate signed 16-bit integer.
4785- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
4786- ``K``: An immediate unsigned 16-bit integer.
4787- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
4788- ``M``: An immediate integer greater than 31.
4789- ``N``: An immediate integer that is an exact power of 2.
4790- ``O``: The immediate integer constant 0.
4791- ``P``: An immediate integer constant whose negation is a signed 16-bit
4792  constant.
4793- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
4794  treated the same as ``m``.
4795- ``r``: A 32 or 64-bit integer register.
4796- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
4797  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).
4801
4802- ``y``: Condition register (``CR0-CR7``).
4803- ``wc``: An individual CR bit in a CR register.
4804- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
4805  register set (overlapping both the floating-point and vector register files).
4806- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
4807  set.
4808
4809RISC-V:
4810
4811- ``A``: An address operand (using a general-purpose register, without an
4812  offset).
4813- ``I``: A 12-bit signed integer immediate operand.
4814- ``J``: A zero integer immediate operand.
4815- ``K``: A 5-bit unsigned integer immediate operand.
4816- ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
4817- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
4818  ``XLEN``).
4819- ``vr``: A vector register. (requires V extension).
4820- ``vm``: A vector mask register. (requires V extension).
4821
4822Sparc:
4823
4824- ``I``: An immediate 13-bit signed integer.
4825- ``r``: A 32-bit integer register.
4826- ``f``: Any floating-point register on SparcV8, or a floating-point
4827  register in the "low" half of the registers on SparcV9.
4828- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
4829
4830SystemZ:
4831
4832- ``I``: An immediate unsigned 8-bit integer.
4833- ``J``: An immediate unsigned 12-bit integer.
4834- ``K``: An immediate signed 16-bit integer.
4835- ``L``: An immediate signed 20-bit integer.
4836- ``M``: An immediate integer 0x7fffffff.
4837- ``Q``: A memory address operand with a base address and a 12-bit immediate
4838  unsigned displacement.
4839- ``R``: A memory address operand with a base address, a 12-bit immediate
4840  unsigned displacement, and an index register.
4841- ``S``: A memory address operand with a base address and a 20-bit immediate
4842  signed displacement.
4843- ``T``: A memory address operand with a base address, a 20-bit immediate
4844  signed displacement, and an index register.
4845- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
4846- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
4847  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
4850- ``f``: A 32, 64, or 128-bit floating-point register.
4851
4852X86:
4853
4854- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
4856- ``K``: An immediate signed 8-bit integer.
4857- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
4858  0xffffffff.
4859- ``M``: An immediate integer between 0 and 3.
4860- ``N``: An immediate unsigned 8-bit integer.
4861- ``O``: An immediate integer between 0 and 127.
4862- ``e``: An immediate 32-bit signed integer.
4863- ``Z``: An immediate 32-bit unsigned integer.
4864- ``o``, ``v``: Treated the same as ``m``, at the moment.
4865- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4866  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
4867  registers, and on X86-64, it is all of the integer registers.
4868- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4869  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
4870- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
4871- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
4872  existed since i386, and can be accessed without the REX prefix.
4873- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
4874- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
4879- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
4880- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
4881  32-bit mode, a 64-bit integer operand will get split into two registers). It
4882  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
4883  operand will get allocated only to RAX -- if two 32-bit operands are needed,
4884  you're better off splitting it yourself, before passing it to the asm
4885  statement.
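
Two short X86 sketches, assuming AT&T syntax and that SSE is enabled (function
names are illustrative): the first uses ``x`` to place vectors in SSE
registers, the second uses ``I`` to require a shift count between 0 and 31.

.. code-block:: llvm

    define <4 x float> @addps(<4 x float> %a, <4 x float> %b) {
      ; Operand 1 is tied to the output; operand 2 is any SSE register.
      %r = call <4 x float> asm "addps $2, $0", "=x,0,x"(<4 x float> %a, <4 x float> %b)
      ret <4 x float> %r
    }

    define i32 @shl3(i32 %x) {
      ; "I" only accepts constants that are valid 32-bit shift counts.
      %r = call i32 asm "shll $2, $0", "=r,0,I,~{flags}"(i32 %x, i32 3)
      ret i32 %r
    }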
4886
4887XCore:
4888
4889- ``r``: A 32-bit integer register.
4890
4891
4892.. _inline-asm-modifiers:
4893
4894Asm template argument modifiers
4895^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4896
4897In the asm template string, modifiers can be used on the operand reference, like
4898"``${0:n}``".
4899
4900The modifiers are, in general, expected to behave the same way they do in
4901GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4902inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4903and GCC likely indicates a bug in LLVM.
4904
4905Target-independent:
4906
4907- ``c``: Print an immediate integer constant unadorned, without
4908  the target-specific immediate punctuation (e.g. no ``$`` prefix).
4909- ``n``: Negate and print immediate integer constant unadorned, without the
4910  target-specific immediate punctuation (e.g. no ``$`` prefix).
4911- ``l``: Print as an unadorned label, without the target-specific label
4912  punctuation (e.g. no ``$`` prefix).
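
A sketch of the ``c`` modifier, assuming an x86 target with AT&T syntax (the
function name is illustrative). An address displacement must be written
without the ``$`` that normally prefixes an immediate, so the constant is
referenced as ``${2:c}``:

.. code-block:: llvm

    define i32 @add16_lea(i32 %x) {
      ; ${2:c} prints the constant unadorned, giving "16(<reg>)" as the
      ; memory operand rather than "$16(<reg>)".
      %r = call i32 asm "leal ${2:c}($1), $0", "=r,r,i"(i32 %x, i32 16)
      ret i32 %r
    }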
4913
4914AArch64:
4915
4916- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
4917  instead of ``x30``, print ``w30``.
4918- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
4919- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
4920  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
4921  ``v*``.
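
For example, a 32-bit add on AArch64 could be written as in this sketch (the
function name is illustrative). Without the ``w`` modifier the operands would
be printed with ``x*`` names:

.. code-block:: llvm

    define i32 @add32(i32 %a, i32 %b) {
      %r = call i32 asm "add ${0:w}, ${1:w}, ${2:w}", "=r,r,r"(i32 %a, i32 %b)
      ret i32 %r
    }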
4922
4923AMDGPU:
4924
4925- ``r``: No effect.
4926
4927ARM:
4928
4929- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
4930  register).
4931- ``P``: No effect.
4932- ``q``: No effect.
4933- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
4934  as ``d4[1]`` instead of ``s9``)
4935- ``B``: Bitwise invert and print an immediate integer constant without ``#``
4936  prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
4938- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
4939  register operands subsequent to the specified one (!), so use carefully.
4940- ``Q``: Print the low-order register of a register-pair, or the low-order
4941  register of a two-register operand.
4942- ``R``: Print the high-order register of a register-pair, or the high-order
4943  register of a two-register operand.
4944- ``H``: Print the second register of a register-pair. (On a big-endian system,
4945  ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
4946  to ``R``.)
4947
4948  .. FIXME: H doesn't currently support printing the second register
4949     of a two-register operand.
4950
4951- ``e``: Print the low doubleword register of a NEON quad register.
4952- ``f``: Print the high doubleword register of a NEON quad register.
4953- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
4954  adornment.
4955
4956Hexagon:
4957
4958- ``L``: Print the second register of a two-register operand. Requires that it
4959  has been allocated consecutively to the first.
4960
4961  .. FIXME: why is it restricted to consecutive ones? And there's
4962     nothing that ensures that happens, is there?
4963
4964- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
4965  nothing. Used to print 'addi' vs 'add' instructions.
4966
4967MSP430:
4968
4969No additional modifiers.
4970
4971MIPS:
4972
- ``X``: Print an immediate integer as hexadecimal.
4974- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
4975- ``d``: Print an immediate integer as decimal.
4976- ``m``: Subtract one and print an immediate integer as decimal.
4977- ``z``: Print $0 if an immediate zero, otherwise print normally.
4978- ``L``: Print the low-order register of a two-register operand, or prints the
4979  address of the low-order word of a double-word memory operand.
4980
4981  .. FIXME: L seems to be missing memory operand support.
4982
4983- ``M``: Print the high-order register of a two-register operand, or prints the
4984  address of the high-order word of a double-word memory operand.
4985
4986  .. FIXME: M seems to be missing memory operand support.
4987
4988- ``D``: Print the second register of a two-register operand, or prints the
4989  second word of a double-word memory operand. (On a big-endian system, ``D`` is
4990  equivalent to ``L``, and on little-endian system, ``D`` is equivalent to
4991  ``M``.)
4992- ``w``: No effect. Provided for compatibility with GCC which requires this
4993  modifier in order to print MSA registers (``W0-W31``) with the ``f``
4994  constraint.
4995
4996NVPTX:
4997
4998- ``r``: No effect.
4999
5000PowerPC:
5001
5002- ``L``: Print the second register of a two-register operand. Requires that it
5003  has been allocated consecutively to the first.
5004
5005  .. FIXME: why is it restricted to consecutive ones? And there's
5006     nothing that ensures that happens, is there?
5007
5008- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5009  nothing. Used to print 'addi' vs 'add' instructions.
5010- ``y``: For a memory operand, prints formatter for a two-register X-form
5011  instruction. (Currently always prints ``r0,OPERAND``).
5012- ``U``: Prints 'u' if the memory operand is an update form, and nothing
5013  otherwise. (NOTE: LLVM does not support update form, so this will currently
5014  always print nothing)
5015- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5016  not support indexed form, so this will currently always print nothing)
5017
5018RISC-V:
5019
5020- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
5021  nothing. Used to print 'addi' vs 'add' instructions, etc.
5022- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
5023  normally.
5024
5025Sparc:
5026
5027- ``r``: No effect.
5028
5029SystemZ:
5030
5031SystemZ implements only ``n``, and does *not* support any of the other
5032target-independent modifiers.
5033
5034X86:
5035
5036- ``c``: Print an unadorned integer or symbol name. (The latter is
5037  target-specific behavior for this typically target-independent modifier).
5038- ``A``: Print a register name with a '``*``' before it.
5039- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5040  operand.
5041- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5042  memory operand.
5043- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5044  operand.
5045- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5046  operand.
5047- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5048  available, otherwise the 32-bit register name; do nothing on a memory operand.
5049- ``n``: Negate and print an unadorned integer, or, for operands other than an
5050  immediate integer (e.g. a relocatable symbol expression), print a '-' before
5051  the operand. (The behavior for relocatable symbol expressions is a
5052  target-specific behavior for this typically target-independent modifier)
5053- ``H``: Print a memory reference with additional offset +8.
5054- ``P``: Print a memory reference or operand for use as the argument of a call
5055  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
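
A sketch combining an X86 constraint with a size modifier, assuming AT&T syntax
(the function name is illustrative). The ``q`` constraint keeps the input in a
register that has an 8-bit low part, and ``${1:b}`` prints that 8-bit name:

.. code-block:: llvm

    define i32 @low_byte(i32 %x) {
      ; Zero-extend the low byte of %x into the result register.
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }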
5056
5057XCore:
5058
5059No additional modifiers.
5060
5061
5062Inline Asm Metadata
5063^^^^^^^^^^^^^^^^^^^
5064
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
them. For example:
5072
5073.. code-block:: llvm
5074
5075    call void asm sideeffect "something bad", ""(), !srcloc !42
5076    ...
5077    !42 = !{ i32 1234567 }
5078
5079It is up to the front-end to make sense of the magic numbers it places
5080in the IR. If the MDNode contains multiple constants, the code generator
5081will use the one that corresponds to the line of the asm that the error
5082occurs on.
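
As a sketch of the multi-constant case, with made-up cookie values and
placeholder asm text: the asm string below has two lines (``\0A`` is a
newline), so an error on the second line would be expected to be reported with
the second cookie.

.. code-block:: llvm

    call void asm sideeffect "first line\0Asecond line", ""(), !srcloc !43
    ...
    !43 = !{ i32 100, i32 200 }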
5083
5084.. _metadata:
5085
5086Metadata
5087========
5088
5089LLVM IR allows metadata to be attached to instructions and global objects in the
5090program that can convey extra information about the code to the optimizers and
5091code generator. One example application of metadata is source-level
5092debug information. There are two metadata primitives: strings and nodes.
5093
5094Metadata does not have a type, and is not a value. If referenced from a
5095``call`` instruction, it uses the ``metadata`` type.
5096
5097All metadata are identified in syntax by an exclamation point ('``!``').
5098
5099.. _metadata-string:
5100
5101Metadata Nodes and Metadata Strings
5102-----------------------------------
5103
5104A metadata string is a string surrounded by double quotes. It can
5105contain any character by escaping non-printable characters with
5106"``\xx``" where "``xx``" is the two digit hex code. For example:
5107"``!"test\00"``".
5108
Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:
5113
5114.. code-block:: llvm
5115
5116    !{ !"test\00", i32 10}
5117
5118Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5119
5120.. code-block:: text
5121
5122    !0 = distinct !{!"test\00", i32 10}
5123
5124``distinct`` nodes are useful when nodes shouldn't be merged based on their
5125content. They can also occur when transformations cause uniquing collisions
5126when metadata operands change.
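
The difference can be sketched as follows (node numbers are arbitrary). The
first two nodes have identical content and are uniqued, so they denote the same
node once loaded; the two ``distinct`` nodes remain separate:

.. code-block:: text

    !0 = !{!"test\00", i32 10}
    !1 = !{!"test\00", i32 10}           ; uniqued: same node as !0
    !2 = distinct !{!"test\00", i32 10}  ; never merged with !0 or !3
    !3 = distinct !{!"test\00", i32 10}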
5127
5128A :ref:`named metadata <namedmetadatastructure>` is a collection of
5129metadata nodes, which can be looked up in the module symbol table. For
5130example:
5131
5132.. code-block:: llvm
5133
5134    !foo = !{!4, !3}
5135
5136Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5137intrinsic is using three metadata arguments:
5138
5139.. code-block:: llvm
5140
5141    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5142
5143Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5144to the ``add`` instruction using the ``!dbg`` identifier:
5145
5146.. code-block:: llvm
5147
5148    %indvar.next = add i64 %indvar, 1, !dbg !21
5149
5150Instructions may not have multiple metadata attachments with the same
5151identifier.
5152
5153Metadata can also be attached to a function or a global variable. Here metadata
5154``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5155and ``g2`` using the ``!dbg`` identifier:
5156
5157.. code-block:: llvm
5158
5159    declare !dbg !22 void @f1()
5160    define void @f2() !dbg !22 {
5161      ret void
5162    }
5163
5164    @g1 = global i32 0, !dbg !22
5165    @g2 = external global i32, !dbg !22
5166
5167Unlike instructions, global objects (functions and global variables) may have
5168multiple metadata attachments with the same identifier.
5169
A transformation is required to drop any metadata attachment that it does not
know, or that it knows it can't preserve. Currently there is an exception for
metadata attachment to globals for ``!type`` and ``!absolute_symbol``, which
can't be unconditionally dropped unless the global is itself deleted.
5174
5175Metadata attached to a module using named metadata may not be dropped, with
5176the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5177
5178More information about specific metadata nodes recognized by the
5179optimizers and code generator is found below.
5180
5181.. _specialized-metadata:
5182
5183Specialized Metadata Nodes
5184^^^^^^^^^^^^^^^^^^^^^^^^^^
5185
5186Specialized metadata nodes are custom data structures in metadata (as opposed
5187to generic tuples). Their fields are labelled, and can be specified in any
5188order.
5189
5190These aren't inherently debug info centric, but currently all the specialized
5191metadata nodes are related to debug info.
5192
5193.. _DICompileUnit:
5194
5195DICompileUnit
5196"""""""""""""
5197
5198``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
5199``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
5200containing the debug info to be emitted along with the compile unit, regardless
5201of code optimizations (some nodes are only emitted if there are references to
5202them from instructions). The ``debugInfoForProfiling:`` field is a boolean
5203indicating whether or not line-table discriminators are updated to provide
5204more-accurate debug info for profiling results.
5205
5206.. code-block:: text
5207
5208    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
5209                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
5210                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
5211                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
5212                        macros: !6, dwoId: 0x0abcd)
5213
5214Compile unit descriptors provide the root scope for objects declared in a
5215specific compilation unit. File descriptors are defined using this scope.  These
5216descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
5217track of global variables, type information, and imported entities (declarations
5218and namespaces).
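
A minimal sketch of how a compile unit is registered, using illustrative file
names and only a few fields. The node is written with ``distinct`` because the
textual IR parser is expected to reject a ``!DICompileUnit`` that is not
distinct:

.. code-block:: text

    !llvm.dbg.cu = !{!0}
    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", emissionKind: FullDebug)
    !1 = !DIFile(filename: "a.c", directory: "/tmp")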
5219
5220.. _DIFile:
5221
5222DIFile
5223""""""
5224
5225``DIFile`` nodes represent files. The ``filename:`` can include slashes.
5226
5227.. code-block:: none
5228
5229    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
5230                 checksumkind: CSK_MD5,
5231                 checksum: "000102030405060708090a0b0c0d0e0f")
5232
5233Files are sometimes used in ``scope:`` fields, and are the only valid target
5234for ``file:`` fields.
Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5``,
``CSK_SHA1`` and ``CSK_SHA256``.
5236
5237.. _DIBasicType:
5238
5239DIBasicType
5240"""""""""""
5241
5242``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
5243``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
5244
5245.. code-block:: text
5246
5247    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5248                      encoding: DW_ATE_unsigned_char)
5249    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
5250
5251The ``encoding:`` describes the details of the type. Usually it's one of the
5252following:
5253
5254.. code-block:: text
5255
5256  DW_ATE_address       = 1
5257  DW_ATE_boolean       = 2
5258  DW_ATE_float         = 4
5259  DW_ATE_signed        = 5
5260  DW_ATE_signed_char   = 6
5261  DW_ATE_unsigned      = 7
5262  DW_ATE_unsigned_char = 8
5263
5264.. _DISubroutineType:
5265
5266DISubroutineType
5267""""""""""""""""
5268
5269``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
5270refers to a tuple; the first operand is the return type, while the rest are the
5271types of the formal arguments in order. If the first operand is ``null``, that
5272represents a function with no return value (such as ``void foo() {}`` in C++).
5273
5274.. code-block:: text
5275
    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
5278    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
5279
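The operand order of the ``types:`` tuple can be sketched in Python as follows.
This is only an illustration: the helper ``subroutine_types`` is hypothetical,
and type nodes are represented by their textual IDs.

.. code-block:: python

    def subroutine_types(return_type, param_types):
        # The return type comes first; None models "no return value" and is
        # printed as null. The formal argument types follow in order.
        operands = [return_type] + list(param_types)
        rendered = ("null" if t is None else t for t in operands)
        return "!{" + ", ".join(rendered) + "}"

    # void (int, char), using the node IDs from the example above
    print(subroutine_types(None, ["!0", "!1"]))
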
5280.. _DIDerivedType:
5281
5282DIDerivedType
5283"""""""""""""
5284
5285``DIDerivedType`` nodes represent types derived from other types, such as
5286qualified types.
5287
5288.. code-block:: text
5289
5290    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5291                      encoding: DW_ATE_unsigned_char)
5292    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
5293                        align: 32)
5294
5295The following ``tag:`` values are valid:
5296
5297.. code-block:: text
5298
5299  DW_TAG_member             = 13
5300  DW_TAG_pointer_type       = 15
5301  DW_TAG_reference_type     = 16
5302  DW_TAG_typedef            = 22
5303  DW_TAG_inheritance        = 28
5304  DW_TAG_ptr_to_member_type = 31
5305  DW_TAG_const_type         = 38
5306  DW_TAG_friend             = 42
5307  DW_TAG_volatile_type      = 53
5308  DW_TAG_restrict_type      = 55
5309  DW_TAG_atomic_type        = 71
5310
5311.. _DIDerivedTypeMember:
5312
5313``DW_TAG_member`` is used to define a member of a :ref:`composite type
5314<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset.  If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.
5318
5319``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
5320field of :ref:`composite types <DICompositeType>` to describe parents and
5321friends.
5322
5323``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
5324
5325``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
5326``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
5327are used to qualify the ``baseType:``.
5328
Note that the ``void *`` type is expressed as a pointer type whose
``baseType:`` is ``null``.
5330
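The roles of the tags described above can be summarized with a small Python
sketch. The grouping follows the preceding paragraphs only; the function
``role`` and its return strings are hypothetical names chosen for illustration.

.. code-block:: python

    DERIVED_TAGS = {
        "DW_TAG_member": 13,
        "DW_TAG_pointer_type": 15,
        "DW_TAG_reference_type": 16,
        "DW_TAG_typedef": 22,
        "DW_TAG_inheritance": 28,
        "DW_TAG_ptr_to_member_type": 31,
        "DW_TAG_const_type": 38,
        "DW_TAG_friend": 42,
        "DW_TAG_volatile_type": 53,
        "DW_TAG_restrict_type": 55,
        "DW_TAG_atomic_type": 71,
    }

    QUALIFIER_TAGS = {
        "DW_TAG_pointer_type", "DW_TAG_reference_type", "DW_TAG_const_type",
        "DW_TAG_volatile_type", "DW_TAG_restrict_type", "DW_TAG_atomic_type",
    }
    COMPOSITE_ELEMENT_TAGS = {
        "DW_TAG_member", "DW_TAG_inheritance", "DW_TAG_friend",
    }

    def role(tag):
        # Classify a DIDerivedType tag according to the prose above.
        if tag not in DERIVED_TAGS:
            raise ValueError(f"not a listed DIDerivedType tag: {tag}")
        if tag in QUALIFIER_TAGS:
            return "qualifies baseType"
        if tag in COMPOSITE_ELEMENT_TAGS:
            return "element of a composite type"
        if tag == "DW_TAG_typedef":
            return "names baseType"
        return "not described in the text above"

    for tag, value in DERIVED_TAGS.items():
        # DWARF documents list these constants in hexadecimal.
        print(f"{tag:<26} = {value:>2} (0x{value:02x})  {role(tag)}")
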
5331.. _DICompositeType:
5332
5333DICompositeType
5334"""""""""""""""
5335
5336``DICompositeType`` nodes represent types composed of other types, like
5337structures and unions. ``elements:`` points to a tuple of the composed types.
5338
5339If the source language supports ODR, the ``identifier:`` field gives the unique
5340identifier used for type merging between modules.  When specified,
5341:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
5342derived types <DIDerivedTypeMember>` that reference the ODR-type in their
5343``scope:`` change uniquing rules.
5344
5345For a given ``identifier:``, there should only be a single composite type that
5346does not have  ``flags: DIFlagFwdDecl`` set.  LLVM tools that link modules
5347together will unique such definitions at parse time via the ``identifier:``
5348field, even if the nodes are ``distinct``.
5349
5350.. code-block:: text
5351
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5353    !1 = !DIEnumerator(name: "SevenKind", value: 7)
5354    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5355    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
5356                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
5357                          elements: !{!0, !1, !2})
5358
5359The following ``tag:`` values are valid:
5360
5361.. code-block:: text
5362
5363  DW_TAG_array_type       = 1
5364  DW_TAG_class_type       = 2
5365  DW_TAG_enumeration_type = 4
5366  DW_TAG_structure_type   = 19
5367  DW_TAG_union_type       = 23
5368
For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag in ``flags:`` indicates that an
array type is a native packed vector. The optional ``dataLocation`` is a
DIExpression that describes how to get from an object's address to the actual
raw data, if they aren't equivalent. This is only supported for array types,
particularly to describe Fortran arrays, which have an array descriptor in
addition to the array data. Alternatively, it can be a DIVariable that holds
the address of the actual raw data. Fortran supports pointer arrays that can be
attached to actual arrays; this attachment between pointer and pointee is
called association. The optional ``associated`` is a DIExpression that
describes whether the pointer array is currently associated. The optional
``allocated`` is a DIExpression that describes whether the allocatable array is
currently allocated. The optional ``rank`` is a DIExpression that describes the
rank (number of dimensions) of a Fortran assumed-rank array, whose rank is only
known at run time.
5385
5386For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
5387descriptors <DIEnumerator>`, each representing the definition of an enumeration
5388value for the set. All enumeration type descriptors are collected in the
5389``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
5390
5391For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
5392``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
5393<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
5394``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
5395``isDefinition: false``.
5396
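The per-tag rules for ``elements:`` can be expressed as a small check. The
following Python sketch is illustrative only: it models each element as a
dictionary with a ``kind`` key (and ``tag`` or ``isDefinition`` where relevant),
and the function name ``element_allowed`` is hypothetical. It is not how the
LLVM verifier is implemented.

.. code-block:: python

    RECORD_TAGS = {
        "DW_TAG_structure_type", "DW_TAG_class_type", "DW_TAG_union_type",
    }
    RECORD_MEMBER_TAGS = {
        "DW_TAG_member", "DW_TAG_inheritance", "DW_TAG_friend",
    }

    def element_allowed(composite_tag, element):
        # Return True if `element` fits the rule stated for `composite_tag`.
        kind = element["kind"]
        if composite_tag == "DW_TAG_array_type":
            return kind == "DISubrange"
        if composite_tag == "DW_TAG_enumeration_type":
            return kind == "DIEnumerator"
        if composite_tag in RECORD_TAGS:
            if kind == "DIDerivedType":
                return element.get("tag") in RECORD_MEMBER_TAGS
            if kind == "DISubprogram":
                # Only declarations may appear in a record's elements.
                return element.get("isDefinition") is False
            return False
        raise ValueError(f"unknown composite tag: {composite_tag}")

    member = {"kind": "DIDerivedType", "tag": "DW_TAG_member"}
    print(element_allowed("DW_TAG_structure_type", member))
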
5397.. _DISubrange:
5398
5399DISubrange
5400""""""""""
5401
5402``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
5403:ref:`DICompositeType`.
5404
5405- ``count: -1`` indicates an empty array.
5406- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
5407- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
5408
5409.. code-block:: text
5410
5411    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
5412    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
5413    !2 = !DISubrange(count: -1) ; empty array.
5414
5415    ; Scopes used in rest of example
5416    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
5417    !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
5418    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
5419
5420    ; Use of local variable as count value
5421    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5422    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
5423    !11 = !DISubrange(count: !10, lowerBound: 0)
5424
5425    ; Use of global variable as count value
5426    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
5427    !13 = !DISubrange(count: !12, lowerBound: 0)
5428
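To make the relationship between ``count:`` and ``lowerBound:`` concrete, the
following Python sketch enumerates the valid subscripts for constant values. It
assumes that an omitted ``lowerBound:`` means 0, and it follows the list above
in treating ``count: -1`` as an empty array. The function name ``subscripts``
is hypothetical.

.. code-block:: python

    def subscripts(count, lower_bound=0):
        # Valid subscripts run from lower_bound to lower_bound + count - 1.
        if count == -1:
            return range(0)  # empty array, as stated in the list above
        return range(lower_bound, lower_bound + count)

    print(list(subscripts(5, 0)))  # array counting from 0
    print(list(subscripts(5, 1)))  # array counting from 1
    print(list(subscripts(-1)))    # empty array
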
5429.. _DIEnumerator:
5430
5431DIEnumerator
5432""""""""""""
5433
5434``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
5435variants of :ref:`DICompositeType`.
5436
5437.. code-block:: text
5438
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5440    !1 = !DIEnumerator(name: "SevenKind", value: 7)
5441    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5442
5443DITemplateTypeParameter
5444"""""""""""""""""""""""
5445
5446``DITemplateTypeParameter`` nodes represent type parameters to generic source
5447language constructs. They are used (optionally) in :ref:`DICompositeType` and
5448:ref:`DISubprogram` ``templateParams:`` fields.
5449
5450.. code-block:: text
5451
5452    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
5453
5454DITemplateValueParameter
5455""""""""""""""""""""""""
5456
5457``DITemplateValueParameter`` nodes represent value parameters to generic source
5458language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
5459but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
5460``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
5461:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
5462
5463.. code-block:: text
5464
5465    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
5466
5467DINamespace
5468"""""""""""
5469
5470``DINamespace`` nodes represent namespaces in the source language.
5471
5472.. code-block:: text
5473
5474    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
5475
5476.. _DIGlobalVariable:
5477
5478DIGlobalVariable
5479""""""""""""""""
5480
5481``DIGlobalVariable`` nodes represent global variables in the source language.
5482
5483.. code-block:: text
5484
5485    @foo = global i32, !dbg !0
5486    !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
5487    !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
5488                           file: !3, line: 7, type: !4, isLocal: true,
5489                           isDefinition: false, declaration: !5)
5490
5491
5492DIGlobalVariableExpression
5493""""""""""""""""""""""""""
5494
5495``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
5496with a :ref:`DIExpression`.
5497
5498.. code-block:: text
5499
5500    @lower = global i32, !dbg !0
5501    @upper = global i32, !dbg !1
5502    !0 = !DIGlobalVariableExpression(
5503             var: !2,
5504             expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
5505             )
5506    !1 = !DIGlobalVariableExpression(
5507             var: !2,
5508             expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
5509             )
5510    !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
5511                           file: !4, line: 8, type: !5, declaration: !6)
5512
All global variable expressions should be referenced by the ``globals:`` field
of a :ref:`compile unit <DICompileUnit>`.
5515
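The two ``DW_OP_LLVM_fragment`` expressions in the example above split a 64-bit
source variable into two 32-bit pieces. The following Python sketch shows the
arithmetic on a plain integer. It assumes that the bit offset is counted from
the least significant bit, as on a little-endian target; this is an assumption
made for illustration, and the function name ``fragment`` is hypothetical.

.. code-block:: python

    def fragment(value, offset_bits, size_bits):
        # Extract size_bits bits of `value`, starting at bit offset_bits.
        # Assumption: bit 0 is the least significant bit.
        return (value >> offset_bits) & ((1 << size_bits) - 1)

    split64 = 0x1122334455667788
    lower = fragment(split64, 0, 32)   # DW_OP_LLVM_fragment, 0, 32
    upper = fragment(split64, 32, 32)  # DW_OP_LLVM_fragment, 32, 32

    # The two fragments together cover the whole variable.
    assert (upper << 32) | lower == split64
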
5516.. _DISubprogram:
5517
5518DISubprogram
5519""""""""""""
5520
5521``DISubprogram`` nodes represent functions from the source language. A distinct
5522``DISubprogram`` may be attached to a function definition using ``!dbg``
5523metadata. A unique ``DISubprogram`` may be attached to a function declaration
5524used for call site debug info. The ``retainedNodes:`` field is a list of
5525:ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
5526retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.
5528
5529.. _DISubprogramDeclaration:
5530
5531When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function.  If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
5534then the subprogram declaration is uniqued based only on its ``linkageName:``
5535and ``scope:``.
5536
5537.. code-block:: text
5538
5539    define void @_Z3foov() !dbg !0 {
5540      ...
5541    }
5542
    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
5544                                file: !2, line: 7, type: !3, isLocal: true,
5545                                isDefinition: true, scopeLine: 8,
5546                                containingType: !4,
5547                                virtuality: DW_VIRTUALITY_pure_virtual,
5548                                virtualIndex: 10, flags: DIFlagPrototyped,
5549                                isOptimized: true, unit: !5, templateParams: !6,
5550                                declaration: !7, retainedNodes: !8,
5551                                thrownTypes: !9)
5552
5553.. _DILexicalBlock:
5554
5555DILexicalBlock
5556""""""""""""""
5557
``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.
5562
5563.. code-block:: text
5564
5565    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
5566
5567Usually lexical blocks are ``distinct`` to prevent node merging based on
5568operands.
5569
5570.. _DILexicalBlockFile:
5571
5572DILexicalBlockFile
5573""""""""""""""""""
5574
5575``DILexicalBlockFile`` nodes are used to discriminate between sections of a
5576:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
5577indicate textual inclusion, or the ``discriminator:`` field can be used to
5578discriminate between control flow within a single block in the source language.
5579
5580.. code-block:: text
5581
5582    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
5583    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
5584    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
5585
5586.. _DILocation:
5587
5588DILocation
5589""""""""""
5590
``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5594
5595.. code-block:: text
5596
5597    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
5598
5599.. _DILocalVariable:
5600
5601DILocalVariable
5602"""""""""""""""
5603
``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to a non-zero value, then this variable is a
subprogram parameter, and it will be included in the ``retainedNodes:`` field
of its :ref:`DISubprogram`.
5608
5609.. code-block:: text
5610
5611    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
5612                          type: !3, flags: DIFlagArtificial)
5613    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
5614                          type: !3)
5615    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
5616
5617.. _DIExpression:
5618
5619DIExpression
5620""""""""""""
5621
5622``DIExpression`` nodes represent expressions that are inspired by the DWARF
5623expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
5624(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable. Debug
expressions are interpreted left-to-right: start by pushing the value/address
operand of the intrinsic onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.

The currently supported opcode vocabulary is limited:
5631
5632- ``DW_OP_deref`` dereferences the top of the expression stack.
5633- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
5634  them together and appends the result to the expression stack.
5635- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
5636  the last entry from the second last entry and appends the result to the
5637  expression stack.
5638- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
5639- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
5640  here, respectively) of the variable fragment from the working expression. Note
5641  that contrary to DW_OP_bit_piece, the offset is describing the location
5642  within the described source variable.
5643- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
5644  (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
5645  expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
5646  that references a base type constructed from the supplied values.
5647- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
5648  optionally applied to the pointer. The memory tag is derived from the
5649  given tag offset in an implementation-defined manner.
5650- ``DW_OP_swap`` swaps top two stack entries.
5651- ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
5652  of the stack is treated as an address. The second stack entry is treated as an
5653  address space identifier.
5654- ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` may only appear in MIR and at the
  beginning of a ``DIExpression``. In DWARF, a ``DBG_VALUE``
  instruction binding a ``DIExpression`` that starts with
  ``DW_OP_LLVM_entry_value`` to a register is lowered to a
  ``DW_OP_entry_value [reg]``, pushing the value the register had upon
  function entry onto the stack.  The next ``(N - 1)`` operations will be
  part of the ``DW_OP_entry_value`` block argument. For example,
  ``!DIExpression(DW_OP_LLVM_entry_value, 1, DW_OP_plus_uconst, 123,
  DW_OP_stack_value)`` specifies an expression where the entry value of the
  debug value instruction's value/address operand is pushed to the stack,
  and 123 is then added to it. Due to framework limitations, ``N`` can
  currently only be 1.

  The operation is introduced by the ``LiveDebugValues`` pass, which
  applies it only to function parameters that are unmodified
  throughout the function. Support is limited to simple register
  location descriptions and to indirect locations (e.g., when a struct
  is passed by value to a callee via a pointer to a temporary copy
  made in the caller). The entry value op is also introduced by the
  ``AsmPrinter`` pass when a call site parameter value
  (``DW_AT_call_site_parameter_value``) is represented as the entry value
  of the parameter.
- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
  value, such as one that calculates the sum of two registers. This is always
  used in combination with an ordered list of values, such that
  ``DW_OP_LLVM_arg, N`` refers to the ``N``\ th element in that list. For
  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the provided
  signed offset from the specified register. The opcode is only generated by the
  ``AsmPrinter`` pass to describe a call site parameter value that requires an
  expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array that has an array
  descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array, whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables that are optimized out but whose
  pointed-to value is known. This operator is required because it differs from
  the DWARF operator ``DW_OP_implicit_pointer`` in representation and
  specification (number and types of operands), and the latter cannot be used
  for multiple levels of indirection.
5701
5702.. code-block:: text
5703
    ; IR for "*ptr = 4;"
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)

    ; IR for "**ptr = 4;"
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
                        DW_OP_LLVM_implicit_pointer)
5723
DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions.  Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
5728descriptions describe the *concrete location* of a source variable (in the
5729sense that a debugger might modify its value), whereas *implicit locations*
5730describe merely the actual *value* of a source variable which might not exist
5731in registers or in memory (see ``DW_OP_stack_value``).
5732
An ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
5734value (the address) of a source variable. The first operand of the intrinsic
5735must be an address of some kind. A DIExpression attached to the intrinsic
5736refines this address to produce a concrete location for the source variable.
5737
An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
5739The first operand of the intrinsic may be a direct or indirect value. A
5740DIExpression attached to the intrinsic refines the first operand to produce a
5741direct value. For example, if the first operand is an indirect value, it may be
5742necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
5743valid debug intrinsic.
5744
5745.. note::
5746
5747   A DIExpression is interpreted in the same way regardless of which kind of
5748   debug intrinsic it's attached to.
5749
5750.. code-block:: text
5751
    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus) ; same effect as !1
    !3 = !DIExpression(DW_OP_LLVM_fragment, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
5759
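The left-to-right stack evaluation described in this section can be sketched in
Python. This is a simplified model for illustration, not LLVM's implementation:
it handles only a few opcodes, represents memory as a dictionary from address
to value, ignores fragments and location kinds, and the function name
``evaluate`` is hypothetical.

.. code-block:: python

    def evaluate(expr, operand, memory=None):
        # Evaluate a list such as ["DW_OP_plus_uconst", 3] against `operand`.
        memory = memory or {}
        stack = [operand]  # start by pushing the intrinsic's operand
        ops = list(expr)
        i = 0
        while i < len(ops):
            op = ops[i]
            if op == "DW_OP_deref":
                stack.append(memory[stack.pop()])
            elif op == "DW_OP_plus":
                rhs, lhs = stack.pop(), stack.pop()
                stack.append(lhs + rhs)
            elif op == "DW_OP_minus":
                # Subtract the last entry from the second last entry.
                rhs, lhs = stack.pop(), stack.pop()
                stack.append(lhs - rhs)
            elif op == "DW_OP_plus_uconst":
                i += 1  # consume the constant argument
                stack.append(stack.pop() + ops[i])
            elif op == "DW_OP_constu":
                i += 1  # consume the constant argument
                stack.append(ops[i])
            elif op == "DW_OP_swap":
                stack[-1], stack[-2] = stack[-2], stack[-1]
            elif op == "DW_OP_stack_value":
                pass  # marks the result as a value; no stack effect here
            else:
                raise ValueError(f"unsupported opcode: {op}")
            i += 1
        return stack[-1]

    # The two spellings of "add 3" from the example above agree.
    assert evaluate(["DW_OP_plus_uconst", 3], 10) == 13
    assert evaluate(["DW_OP_constu", 3, "DW_OP_plus"], 10) == 13
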
5760DIArgList
"""""""""
5762
5763``DIArgList`` nodes hold a list of constant or SSA value references. These are
5764used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
5765``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
5766``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
5767within a function, it must only be used as a function argument, must always be
5768inlined, and cannot appear in named metadata.
5769
5770.. code-block:: text
5771
    call void @llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
                              metadata !16,
                              metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))
5775
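How ``DW_OP_LLVM_arg`` selects entries from the argument list can be sketched
in Python. The model below is an illustration under simplifying assumptions:
the stack starts empty, values are plain integers, only three further opcodes
are handled, and the function name ``evaluate_with_args`` is hypothetical.

.. code-block:: python

    def evaluate_with_args(expr, args):
        # Evaluate an expression whose values come from a DIArgList.
        stack = []
        ops = iter(expr)
        for op in ops:
            if op == "DW_OP_LLVM_arg":
                stack.append(args[next(ops)])  # push the N-th list element
            elif op == "DW_OP_plus":
                rhs, lhs = stack.pop(), stack.pop()
                stack.append(lhs + rhs)
            elif op == "DW_OP_minus":
                rhs, lhs = stack.pop(), stack.pop()
                stack.append(lhs - rhs)
            elif op == "DW_OP_stack_value":
                pass  # no stack effect in this model
            else:
                raise ValueError(f"unsupported opcode: {op}")
        return stack[-1]

    a, b = 5, 3  # stand-ins for the SSA values %a and %b
    expr = ["DW_OP_LLVM_arg", 0, "DW_OP_LLVM_arg", 1, "DW_OP_plus"]
    assert evaluate_with_args(expr, [a, b]) == a + b
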
5776DIFlags
"""""""
5778
5779These flags encode various properties of DINodes.
5780
The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether ``DW_AT_export_symbols`` can
be used for the structure type.
5785
5786DIObjCProperty
5787""""""""""""""
5788
5789``DIObjCProperty`` nodes represent Objective-C property nodes.
5790
5791.. code-block:: text
5792
5793    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
5794                         getter: "getFoo", attributes: 7, type: !2)
5795
5796DIImportedEntity
5797""""""""""""""""
5798
``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit. The ``elements:`` field is a list of renamed entities (such as
variables and subprograms) in the imported entity (such as a module).
5802
5803.. code-block:: text
5804
5805   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
5806                          entity: !1, line: 7, elements: !3)
5807   !3 = !{!4}
5808   !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
5809                          entity: !5, line: 7)
5810
5811DIMacro
5812"""""""
5813
``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.
5818
5819.. code-block:: text
5820
5821   !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
5822                 value: "((x) + 1)")
5823   !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")
5824
5825DIMacroFile
5826"""""""""""
5827
5828``DIMacroFile`` nodes represent inclusion of source files.
5829The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
5830appear in the included source file.
5831
5832.. code-block:: text
5833
   !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !1,
5835                     nodes: !3)
5836
5837.. _DILabel:
5838
5839DILabel
5840"""""""
5841
5842``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
5843a ``DILabel`` are mandatory. The ``scope:`` field must be one of either a
5844:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5845The ``name:`` field is the label identifier. The ``file:`` field is the
5846:ref:`DIFile` the label is present in. The ``line:`` field is the source line
5847within the file where the label is declared.
5848
5849.. code-block:: text
5850
5851  !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
5852
5853'``tbaa``' Metadata
5854^^^^^^^^^^^^^^^^^^^
5855
5856In LLVM IR, memory does not have types, so LLVM's own type system is not
5857suitable for doing type based alias analysis (TBAA). Instead, metadata is
5858added to the IR to describe a type system of a higher level language. This
5859can be used to implement C/C++ strict type aliasing rules, but it can also
5860be used to implement custom alias analysis behavior for other languages.
5861
5862This description of LLVM's TBAA system is broken into two parts:
5863:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
5864:ref:`Representation<tbaa_node_representation>` talks about the metadata
5865encoding of various entities.
5866
5867It is always possible to trace any TBAA node to a "root" TBAA node (details
5868in the :ref:`Representation<tbaa_node_representation>` section).  TBAA
5869nodes with different roots have an unknown aliasing relationship, and LLVM
5870conservatively infers ``MayAlias`` between them.  The rules mentioned in
5871this section only pertain to TBAA nodes living under the same root.
5872
5873.. _tbaa_node_semantics:
5874
5875Semantics
5876"""""""""
5877
5878The TBAA metadata system, referred to as "struct path TBAA" (not to be
5879confused with ``tbaa.struct``), consists of the following high level
5880concepts: *Type Descriptors*, further subdivided into scalar type
5881descriptors and struct type descriptors; and *Access Tags*.
5882
5883**Type descriptors** describe the type system of the higher level language
5884being compiled.  **Scalar type descriptors** describe types that do not
5885contain other types.  Each scalar type has a parent type, which must also
5886be a scalar type or the TBAA root.  Via this parent relation, scalar types
5887within a TBAA root form a tree.  **Struct type descriptors** denote types
5888that contain a sequence of other type descriptors, at known offsets.  These
5889contained type descriptors can either be struct type descriptors themselves
5890or scalar type descriptors.
5891
5892**Access tags** are metadata nodes attached to load and store instructions.
5893Access tags use type descriptors to describe the *location* being accessed
5894in terms of the type system of the higher level language.  Access tags are
5895tuples consisting of a base type, an access type and an offset.  The base
5896type is a scalar type descriptor or a struct type descriptor, the access
5897type is a scalar type descriptor, and the offset is a constant integer.
5898
5899The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
5900things:
5901
5902 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
5903   or store) of a value of type ``AccessTy`` contained in the struct type
5904   ``BaseTy`` at offset ``Offset``.
5905
5906 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
5907   ``AccessTy`` must be the same; and the access tag describes a scalar
5908   access with scalar type ``AccessTy``.
5909
5910We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
5911tuples this way:
5912
5913 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
5914   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
5915   described in the TBAA metadata.  ``ImmediateParent(BaseTy, Offset)`` is
5916   undefined if ``Offset`` is non-zero.
5917
5918 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
5919   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
5920   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
5921   to be relative within that inner type.
5922
A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.
5927
5928As a concrete example, the type descriptor graph for the following program
5929
5930.. code-block:: c
5931
5932    struct Inner {
5933      int i;    // offset 0
5934      float f;  // offset 4
5935    };
5936
5937    struct Outer {
5938      float f;  // offset 0
5939      double d; // offset 4
5940      struct Inner inner_a;  // offset 12
5941    };
5942
5943    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
5944      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
5945      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
5946      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
5947      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
5948    }
5949
5950is (note that in C and C++, ``char`` can be used to access any arbitrary
5951type):
5952
5953.. code-block:: text
5954
5955    Root = "TBAA Root"
5956    CharScalarTy = ("char", Root, 0)
5957    FloatScalarTy = ("float", CharScalarTy, 0)
5958    DoubleScalarTy = ("double", CharScalarTy, 0)
5959    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
5961    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
5962                     (InnerStructTy, 12)}
5963
5964
5965with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
59660)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
5967``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
5968
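The ``ImmediateParent`` relation and the aliasing rule can be traced on this
example with a short Python sketch. It is a model of the rules as stated in
this section, not of LLVM's implementation. Because the descriptors carry no
sizes, the sketch assumes that the field containing an offset is the one with
the largest field offset not exceeding it. The function names are hypothetical.

.. code-block:: python

    ROOT = "Root"
    # Scalar type -> parent type
    SCALARS = {
        "CharScalarTy": ROOT,
        "FloatScalarTy": "CharScalarTy",
        "DoubleScalarTy": "CharScalarTy",
        "IntScalarTy": "CharScalarTy",
    }
    # Struct type -> list of (field type, field offset)
    STRUCTS = {
        "InnerStructTy": [("IntScalarTy", 0), ("FloatScalarTy", 4)],
        "OuterStructTy": [("FloatScalarTy", 0), ("DoubleScalarTy", 4),
                          ("InnerStructTy", 12)],
    }

    def immediate_parent(ty, offset):
        if ty in SCALARS:
            # Undefined for non-zero offsets; modelled as None.
            return (SCALARS[ty], 0) if offset == 0 else None
        if ty in STRUCTS:
            # Assumption: pick the last field starting at or before offset.
            candidates = [(f, o) for f, o in STRUCTS[ty] if o <= offset]
            if not candidates:
                return None
            field, field_offset = max(candidates, key=lambda fo: fo[1])
            return (field, offset - field_offset)
        return None  # the root has no parent

    def reachable(ty, offset):
        # All (type, offset) pairs reachable from the start, start included.
        chain, node = [], (ty, offset)
        while node is not None:
            chain.append(node)
            node = immediate_parent(*node)
        return chain

    def aliases(tag1, tag2):
        # Tags are (BaseTy, AccessTy, Offset) tuples.
        a, b = (tag1[0], tag1[2]), (tag2[0], tag2[2])
        return a in reachable(*b) or b in reachable(*a)

    tag0 = ("OuterStructTy", "FloatScalarTy", 0)
    tag1 = ("OuterStructTy", "IntScalarTy", 12)
    tag2 = ("OuterStructTy", "FloatScalarTy", 16)
    tag3 = ("FloatScalarTy", "FloatScalarTy", 0)

    assert aliases(tag0, tag3)      # *f may alias outer->f
    assert aliases(tag2, tag3)      # *f may alias outer->inner_a.f
    assert not aliases(tag1, tag3)  # *f does not alias outer->inner_a.i
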
5969.. _tbaa_node_representation:
5970
5971Representation
5972""""""""""""""
5973
5974The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
5975with exactly one ``MDString`` operand.
5976
Scalar type descriptors are represented as ``MDNode`` s with two
operands.  The first operand is an ``MDString`` denoting the name of the
scalar type.  LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``.  The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root.  Scalar
type descriptors can have an optional third argument, but that must be the
constant integer zero.
5985
5986Struct type descriptors are represented as ``MDNode`` s with an odd number
5987of operands greater than 1.  The first operand is an ``MDString`` denoting
the name of the struct type.  As with scalar type descriptors, the actual
value of this name operand is irrelevant to LLVM.  After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands.  With N starting from 1, the (2N - 1)th operand,
an ``MDNode``, denotes a contained field, and the 2N-th operand, a
``ConstantInt``, is the offset of that field.  The offsets
must be in non-decreasing order.
5995
5996Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
5997The first operand is an ``MDNode`` pointing to the node representing the
5998base type.  The second operand is an ``MDNode`` pointing to the node
5999representing the access type.  The third operand is a ``ConstantInt`` that
states the offset of the access.  If a fourth operand is present, it must be
a ``ConstantInt`` valued at 0 or 1.  If it is 1 then the access tag states
6002that the location being accessed is "constant" (meaning
6003``pointsToConstantMemory`` should return true; see `other useful
6004AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).  The TBAA root of
6005the access type and the base type of an access tag must be the same, and
6006that is the TBAA root of the access tag.
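
As a rough sketch, the type hierarchy and access tags from the ``struct
Outer`` example above might be encoded as the metadata nodes below. The node
numbers and the ``%outer.f`` value name are chosen for illustration only;
frontends may emit different names and numbering.

.. code-block:: llvm

      ; outer->f = 0, tagged with tag0 (%outer.f is a hypothetical pointer to the field)
      store float 0.0, float* %outer.f, align 4, !tbaa !7
    ...
    !0 = !{!"TBAA Root"}          ; root: a single MDString operand
    !1 = !{!"char", !0, i64 0}    ; scalar: name, parent, optional zero
    !2 = !{!"float", !1, i64 0}
    !3 = !{!"double", !1, i64 0}
    !4 = !{!"int", !1, i64 0}
    ; struct: name, then alternating (field type, offset) operands
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}
    ; access tags: base type, access type, offset
    !7 = !{!6, !2, i64 0}         ; tag0: (OuterStructTy, FloatScalarTy, 0)
    !8 = !{!6, !4, i64 12}        ; tag1: (OuterStructTy, IntScalarTy, 12)
    !9 = !{!6, !2, i64 16}        ; tag2: (OuterStructTy, FloatScalarTy, 16)
    !10 = !{!2, !2, i64 0}        ; tag3: (FloatScalarTy, FloatScalarTy, 0)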
6007
6008'``tbaa.struct``' Metadata
6009^^^^^^^^^^^^^^^^^^^^^^^^^^
6010
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. It also doesn't carry any TBAA information about the fields
of the aggregate.
6017
6018``!tbaa.struct`` metadata can describe which memory subregions in a
6019memcpy are padding and what the TBAA tags of the struct are.
6020
The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its TBAA tag. For example:
6026
6027.. code-block:: llvm
6028
6029    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
6030
This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes
with size 4 bytes, and has TBAA tag ``!2``.
6034
6035Note that the fields need not be contiguous. In this example, there is a
60364 byte gap between the two fields. This gap represents padding which
6037does not carry useful data and need not be preserved.
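
For illustration, the ``!4`` node above might be attached to a 12-byte
``llvm.memcpy`` call as sketched below. The definitions of ``!1`` and ``!2``
as access tags for ``int`` and ``float`` fields, the type names, and the
``%dst`` and ``%src`` value names are assumptions made for this example.

.. code-block:: llvm

      call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                           i64 12, i1 false), !tbaa.struct !4
    ...
    !1 = !{!5, !5, i64 0}   ; assumed access tag for the int field at offset 0
    !2 = !{!6, !6, i64 0}   ; assumed access tag for the float field at offset 8
    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
    !5 = !{!"int", !7, i64 0}
    !6 = !{!"float", !7, i64 0}
    !7 = !{!"char", !8, i64 0}
    !8 = !{!"TBAA Root"}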
6038
6039'``noalias``' and '``alias.scope``' Metadata
6040^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6041
``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.
6049
6050When evaluating an aliasing query, if for some domain, the set
6051of scopes with that domain in one instruction's ``alias.scope`` list is a
6052subset of (or equal to) the set of scopes for that domain in another
6053instruction's ``noalias`` list, then the two memory accesses are assumed not to
6054alias.
6055
6056Because scopes in one domain don't affect scopes in other domains, separate
6057domains can be used to compose multiple independent noalias sets.  This is
6058used for example during inlining.  As the noalias function parameters are
6059turned into noalias scope metadata, a new domain is used every time the
6060function is inlined.
6061
6062The metadata identifying each domain is itself a list containing one or two
6063entries. The first entry is the name of the domain. Note that if the name is a
6064string then it can be combined across functions and translation units. A
6065self-reference can be used to create globally unique domain names. A
6066descriptive string may optionally be provided as a second list entry.
6067
6068The metadata identifying each scope is also itself a list containing two or
6069three entries. The first entry is the name of the scope. Note that if the name
6070is a string then it can be combined across functions and translation units. A
6071self-reference can be used to create globally unique scope names. A metadata
6072reference to the scope's domain is the second entry. A descriptive string may
6073optionally be provided as a third list entry.
6074
6075For example,
6076
6077.. code-block:: llvm
6078
6079    ; Two scope domains:
6080    !0 = !{!0}
6081    !1 = !{!1}
6082
6083    ; Some scopes in these domains:
6084    !2 = !{!2, !0}
6085    !3 = !{!3, !0}
6086    !4 = !{!4, !1}
6087
6088    ; Some scope lists:
6089    !5 = !{!4} ; A list containing only scope !4
6090    !6 = !{!4, !3, !2}
6091    !7 = !{!3}
6092
6093    ; These two instructions don't alias:
6094    %0 = load float, float* %c, align 4, !alias.scope !5
6095    store float %0, float* %arrayidx.i, align 4, !noalias !5
6096
6097    ; These two instructions also don't alias (for domain !1, the set of scopes
6098    ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, float* %c, align 4, !alias.scope !5
    store float %1, float* %arrayidx.i2, align 4, !noalias !6
6101
6102    ; These two instructions may alias (for domain !0, the set of scopes in
6103    ; the !noalias list is not a superset of, or equal to, the scopes in the
6104    ; !alias.scope list):
6105    %2 = load float, float* %c, align 4, !alias.scope !6
6106    store float %0, float* %arrayidx.i, align 4, !noalias !7
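
Domains and scopes may also carry the optional descriptive strings mentioned
above. The following is a minimal sketch in which the string contents and the
value names ``%a`` and ``%b`` are invented for illustration:

.. code-block:: llvm

    ; A domain with a descriptive string as its second entry:
    !0 = !{!0, !"f"}

    ; Two scopes in that domain, each with a descriptive string as third entry:
    !1 = !{!1, !0, !"f: %a"}
    !2 = !{!2, !0, !"f: %b"}

    ; Scope lists:
    !3 = !{!1}
    !4 = !{!2}

    ; The load is in scope !1 and the store lists !1 as noalias, so these
    ; two accesses are assumed not to alias:
    %v = load float, float* %a, align 4, !alias.scope !3, !noalias !4
    store float %v, float* %b, align 4, !alias.scope !4, !noalias !3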
6107
6108'``fpmath``' Metadata
6109^^^^^^^^^^^^^^^^^^^^^
6110
6111``fpmath`` metadata may be attached to any instruction of floating-point
6112type. It can be used to express the maximum acceptable error in the
6113result of that instruction, in ULPs, thus potentially allowing the
6114compiler to use a more efficient but less accurate method of computing
6115it. ULP is defined as follows:
6116
6117    If ``x`` is a real number that lies between two finite consecutive
6118    floating-point numbers ``a`` and ``b``, without being equal to one
6119    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
6120    distance between the two non-equal finite floating-point numbers
6121    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
6122
6123The metadata node shall consist of a single positive float type number
6124representing the maximum relative error, for example:
6125
6126.. code-block:: llvm
6127
6128    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
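
A sketch of how such a node might be attached to an instruction; the value
names ``%a``, ``%b`` and ``%q`` are assumptions for this example:

.. code-block:: llvm

      ; The division may be computed with up to 2.5 ULPs of error
      %q = fdiv float %a, %b, !fpmath !0
    ...
    !0 = !{ float 2.5 }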
6129
6130.. _range-metadata:
6131
6132'``range``' Metadata
6133^^^^^^^^^^^^^^^^^^^^
6134
6135``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
6136integer types. It expresses the possible ranges the loaded value or the value
6137returned by the called function at this call site is in. If the loaded or
6138returned value is not in the specified range, the behavior is undefined. The
6139ranges are represented with a flattened list of integers. The loaded value or
6140the value returned is known to be in the union of the ranges defined by each
6141consecutive pair. Each pair has the following properties:
6142
6143-  The type must match the type loaded by the instruction.
6144-  The pair ``a,b`` represents the range ``[a,b)``.
6145-  Both ``a`` and ``b`` are constants.
6146-  The range is allowed to wrap.
6147-  The range should not represent the full or empty set. That is,
6148   ``a!=b``.
6149
6150In addition, the pairs must be in signed order of the lower bound and
6151they must be non-contiguous.
6152
6153Examples:
6154
6155.. code-block:: llvm
6156
6157      %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
6158      %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
6159      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
6160      %d = invoke i8 @bar() to label %cont
6161             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
6162    ...
6163    !0 = !{ i8 0, i8 2 }
6164    !1 = !{ i8 255, i8 2 }
6165    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
6166    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
6167
6168'``absolute_symbol``' Metadata
6169^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6170
6171``absolute_symbol`` metadata may be attached to a global variable
6172declaration. It marks the declaration as a reference to an absolute symbol,
6173which causes the backend to use absolute relocations for the symbol even
6174in position independent code, and expresses the possible ranges that the
6175global variable's *address* (not its value) is in, in the same format as
6176``range`` metadata, with the extension that the pair ``all-ones,all-ones``
6177may be used to represent the full set.
6178
6179Example (assuming 64-bit pointers):
6180
6181.. code-block:: llvm
6182
6183      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
6184      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
6185
6186    ...
6187    !0 = !{ i64 0, i64 256 }
6188    !1 = !{ i64 -1, i64 -1 }
6189
6190'``callees``' Metadata
6191^^^^^^^^^^^^^^^^^^^^^^
6192
6193``callees`` metadata may be attached to indirect call sites. If ``callees``
6194metadata is attached to a call site, and any callee is not among the set of
6195functions provided by the metadata, the behavior is undefined. The intent of
6196this metadata is to facilitate optimizations such as indirect-call promotion.
6197For example, in the code below, the call instruction may only target the
6198``add`` or ``sub`` functions:
6199
6200.. code-block:: llvm
6201
6202    %result = call i64 %binop(i64 %x, i64 %y), !callees !0
6203
6204    ...
6205    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}
6206
6207'``callback``' Metadata
6208^^^^^^^^^^^^^^^^^^^^^^^
6209
``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata. Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
semantic restriction on the broker function itself is that it is not allowed to
inspect or modify arguments referenced in the ``callback`` metadata as
pass-through to the callback function.
6220
6221The broker is not required to actually invoke the callback function at runtime.
6222However, the assumptions about not inspecting or modifying arguments that would
6223be passed to the specified callback function still hold, even if the callback
6224function is not dynamically invoked. The broker is allowed to invoke the
6225callback function more than once per invocation of the broker. The broker is
6226also allowed to invoke (directly or indirectly) the function passed as a
6227callback through another use. Finally, the broker is also allowed to relay the
6228callback callee invocation to a different thread.
6229
6230The metadata is structured as follows: At the outer level, ``callback``
6231metadata is a list of ``callback`` encodings. Each encoding starts with a
6232constant ``i64`` which describes the argument position of the callback function
6233in the call to the broker. The following elements, except the last, describe
6234what arguments are passed to the callback function. Each element is again an
6235``i64`` constant identifying the argument of the broker that is passed through,
6236or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
6237they are listed has to be the same in which they are passed to the callback
6238callee. The last element of the encoding is a boolean which specifies how
6239variadic arguments of the broker are handled. If it is true, all variadic
6240arguments of the broker are passed through to the callback function *after* the
6241arguments encoded explicitly before.
6242
In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. This encoding
identifies the callback function as the broker argument at position 2
(``i64 2``, counting from zero) and the sole argument of the callback function
as the broker argument at position 3 (``i64 3``).
6249
6250.. FIXME why does the llvm-sphinx-docs builder give a highlighting
6251   error if the below is set to highlight as 'llvm', despite that we
6252   have misc.highlighting_failure set?
6253
6254.. code-block:: text
6255
6256    declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)
6257
6258    ...
6259    !2 = !{i64 2, i64 3, i1 false}
6260    !1 = !{!2}
6261
Another example is shown below. The callback callee is the argument at
position 2 of the ``__kmpc_fork_call`` function (``i64 2``, counting from
zero). The callee is given two unknown values (each identified by an
``i64 -1``), followed by all variadic arguments that are passed to the
``__kmpc_fork_call`` call (due to the final ``i1 true``).
6267
6268.. FIXME why does the llvm-sphinx-docs builder give a highlighting
6269   error if the below is set to highlight as 'llvm', despite that we
6270   have misc.highlighting_failure set?
6271
6272.. code-block:: text
6273
6274    declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...)
6275
6276    ...
6277    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
6278    !0 = !{!1}
6279
6280
6281'``unpredictable``' Metadata
6282^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6283
6284``unpredictable`` metadata may be attached to any branch or switch
6285instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
6287optimizations related to compare and branch instructions. The metadata
6288is treated as a boolean value; if it exists, it signals that the branch
6289or switch that it is attached to is completely unpredictable.
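
A minimal sketch of attaching the metadata to a conditional branch. Because
only the presence of the metadata matters, an empty node is assumed here; the
value and label names are illustrative.

.. code-block:: llvm

      br i1 %cond, label %then, label %else, !unpredictable !0
    ...
    !0 = !{}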
6290
6291.. _md_dereferenceable:
6292
6293'``dereferenceable``' Metadata
6294^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6295
The existence of the ``!dereferenceable`` metadata on a ``load`` instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.
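
For example, the following sketch states that the loaded pointer ``%p`` can
be dereferenced for at least 4 bytes. The value names and the byte count are
assumptions for this example.

.. code-block:: llvm

      %p = load i32*, i32** %pp, align 8, !dereferenceable !0
    ...
    !0 = !{i64 4}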
6301
6302.. _md_dereferenceable_or_null:
6303
6304'``dereferenceable_or_null``' Metadata
6305^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6306
The existence of the ``!dereferenceable_or_null`` metadata on a ``load``
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values.
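
As a sketch with assumed value names, the loaded pointer ``%q`` below is
either null or dereferenceable for at least 16 bytes:

.. code-block:: llvm

      %q = load i8*, i8** %qq, align 8, !dereferenceable_or_null !0
    ...
    !0 = !{i64 16}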
6313
6314.. _llvm.loop:
6315
6316'``llvm.loop``'
6317^^^^^^^^^^^^^^^
6318
6319It is sometimes useful to attach information to loop constructs. Currently,
6320loop metadata is implemented as metadata attached to the branch instruction
6321in the loop latch block. The loop metadata node is a list of
6322other metadata nodes, each representing a property of the loop. Usually,
6323the first item of the property node is a string. For example, the
6324``llvm.loop.unroll.count`` suggests an unroll factor to the loop
6325unroller:
6326
6327.. code-block:: llvm
6328
6329      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
6330    ...
6331    !0 = !{!0, !1, !2}
6332    !1 = !{!"llvm.loop.unroll.enable"}
6333    !2 = !{!"llvm.loop.unroll.count", i32 4}
6334
6335For legacy reasons, the first item of a loop metadata node must be a
6336reference to itself. Before the advent of the 'distinct' keyword, this
6337forced the preservation of otherwise identical metadata nodes. Since
6338the loop-metadata node can be attached to multiple nodes, the 'distinct'
6339keyword has become unnecessary.
6340
6341Prior to the property nodes, one or two ``DILocation`` (debug location)
6342nodes can be present in the list. The first, if present, identifies the
6343source-code location where the loop begins. The second, if present,
6344identifies the source-code location where the loop ends.
6345
6346Loop metadata nodes cannot be used as unique identifiers. They are
6347neither persistent for the same loop through transformations nor
6348necessarily unique to just one loop.
6349
6350'``llvm.loop.disable_nonforced``'
6351^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6352
6353This metadata disables all optional loop transformations unless
6354explicitly instructed using other transformation metadata such as
6355``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion can make
6359other transformations impossible. Mandatory loop canonicalizations such
6360as loop rotation are still applied.
6361
It is recommended to use this metadata in addition to any ``llvm.loop.*``
6363transformation directive. Also, any loop should have at most one
6364directive applied to it (and a sequence of transformations built using
6365followup-attributes). Otherwise, which transformation will be applied
6366depends on implementation details such as the pass pipeline order.
6367
6368See :ref:`transformation-metadata` for details.
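
The recommended combination might look like the sketch below, which requests
unrolling by a factor of 4 and disables all other optional transformations.
The single-operand form of the ``llvm.loop.disable_nonforced`` node is an
assumption here, by analogy with ``llvm.loop.unroll.disable``; the value and
label names are illustrative.

.. code-block:: llvm

      br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.disable_nonforced"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}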
6369
6370'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
6371^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6372
6373Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
6374used to control per-loop vectorization and interleaving parameters such as
6375vectorization width and interleave count. These metadata should be used in
6376conjunction with ``llvm.loop`` loop identification metadata. The
6377``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
6378optimization hints and the optimizer will only interleave and vectorize loops if
6379it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
6380which contains information about loop-carried memory dependencies can be helpful
6381in determining the safety of these transformations.
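
As a sketch of how these hints combine with ``llvm.loop`` metadata, the
latch branch below requests vectorization with a width of 4 and an
interleave count of 2. The value and label names are assumptions for this
example; the individual nodes are described in the following sections.

.. code-block:: llvm

      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2, !3}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
    !2 = !{!"llvm.loop.vectorize.width", i32 4}
    !3 = !{!"llvm.loop.interleave.count", i32 2}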
6382
6383'``llvm.loop.interleave.count``' Metadata
6384^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6385
6386This metadata suggests an interleave count to the loop interleaver.
6387The first operand is the string ``llvm.loop.interleave.count`` and the
6388second operand is an integer specifying the interleave count. For
6389example:
6390
6391.. code-block:: llvm
6392
6393   !0 = !{!"llvm.loop.interleave.count", i32 4}
6394
6395Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
6396multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
6397then the interleave count will be determined automatically.
6398
6399'``llvm.loop.vectorize.enable``' Metadata
6400^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6401
6402This metadata selectively enables or disables vectorization for the loop. The
6403first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
6404is a bit. If the bit operand value is 1 vectorization is enabled. A value of
64050 disables vectorization:
6406
6407.. code-block:: llvm
6408
6409   !0 = !{!"llvm.loop.vectorize.enable", i1 0}
6410   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
6411
6412'``llvm.loop.vectorize.predicate.enable``' Metadata
6413^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6414
6415This metadata selectively enables or disables creating predicated instructions
6416for the loop, which can enable folding of the scalar epilogue loop into the
6417main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1 predication is enabled. A value of 0 disables
predication:
6421
6422.. code-block:: llvm
6423
6424   !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
6425   !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
6426
6427'``llvm.loop.vectorize.scalable.enable``' Metadata
6428^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6429
6430This metadata selectively enables or disables scalable vectorization for the
6431loop, and only has any effect if vectorization for the loop is already enabled.
6432The first operand is the string ``llvm.loop.vectorize.scalable.enable``
6433and the second operand is a bit. If the bit operand value is 1 scalable
6434vectorization is enabled, whereas a value of 0 reverts to the default fixed
6435width vectorization:
6436
6437.. code-block:: llvm
6438
6439   !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
6440   !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
6441
6442'``llvm.loop.vectorize.width``' Metadata
6443^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6444
6445This metadata sets the target width of the vectorizer. The first
6446operand is the string ``llvm.loop.vectorize.width`` and the second
6447operand is an integer specifying the width. For example:
6448
6449.. code-block:: llvm
6450
6451   !0 = !{!"llvm.loop.vectorize.width", i32 4}
6452
6453Note that setting ``llvm.loop.vectorize.width`` to 1 disables
6454vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
64550 or if the loop does not have this metadata the width will be
6456determined automatically.
6457
6458'``llvm.loop.vectorize.followup_vectorized``' Metadata
6459^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6460
6461This metadata defines which loop attributes the vectorized loop will
6462have. See :ref:`transformation-metadata` for details.
6463
6464'``llvm.loop.vectorize.followup_epilogue``' Metadata
6465^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6466
6467This metadata defines which loop attributes the epilogue will have. The
6468epilogue is not vectorized and is executed when either the vectorized
6469loop is not known to preserve semantics (because e.g., it processes two
6470arrays that are found to alias by a runtime check) or for the last
6471iterations that do not fill a complete set of vector lanes. See
6472:ref:`Transformation Metadata <transformation-metadata>` for details.
6473
6474'``llvm.loop.vectorize.followup_all``' Metadata
6475^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6476
6477Attributes in the metadata will be added to both the vectorized and
6478epilogue loop.
6479See :ref:`Transformation Metadata <transformation-metadata>` for details.
6480
6481'``llvm.loop.unroll``'
6482^^^^^^^^^^^^^^^^^^^^^^
6483
6484Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
6485optimization hints such as the unroll factor. ``llvm.loop.unroll``
6486metadata should be used in conjunction with ``llvm.loop`` loop
6487identification metadata. The ``llvm.loop.unroll`` metadata are only
6488optimization hints and the unrolling will only be performed if the
6489optimizer believes it is safe to do so.
6490
6491'``llvm.loop.unroll.count``' Metadata
6492^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6493
6494This metadata suggests an unroll factor to the loop unroller. The
6495first operand is the string ``llvm.loop.unroll.count`` and the second
6496operand is a positive integer specifying the unroll factor. For
6497example:
6498
6499.. code-block:: llvm
6500
6501   !0 = !{!"llvm.loop.unroll.count", i32 4}
6502
6503If the trip count of the loop is less than the unroll count the loop
6504will be partially unrolled.
6505
6506'``llvm.loop.unroll.disable``' Metadata
6507^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6508
6509This metadata disables loop unrolling. The metadata has a single operand
6510which is the string ``llvm.loop.unroll.disable``. For example:
6511
6512.. code-block:: llvm
6513
6514   !0 = !{!"llvm.loop.unroll.disable"}
6515
6516'``llvm.loop.unroll.runtime.disable``' Metadata
6517^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6518
6519This metadata disables runtime loop unrolling. The metadata has a single
6520operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
6521
6522.. code-block:: llvm
6523
6524   !0 = !{!"llvm.loop.unroll.runtime.disable"}
6525
6526'``llvm.loop.unroll.enable``' Metadata
6527^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6528
6529This metadata suggests that the loop should be fully unrolled if the trip count
6530is known at compile time and partially unrolled if the trip count is not known
6531at compile time. The metadata has a single operand which is the string
6532``llvm.loop.unroll.enable``.  For example:
6533
6534.. code-block:: llvm
6535
6536   !0 = !{!"llvm.loop.unroll.enable"}
6537
6538'``llvm.loop.unroll.full``' Metadata
6539^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6540
6541This metadata suggests that the loop should be unrolled fully. The
6542metadata has a single operand which is the string ``llvm.loop.unroll.full``.
6543For example:
6544
6545.. code-block:: llvm
6546
6547   !0 = !{!"llvm.loop.unroll.full"}
6548
6549'``llvm.loop.unroll.followup``' Metadata
6550^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6551
6552This metadata defines which loop attributes the unrolled loop will have.
6553See :ref:`Transformation Metadata <transformation-metadata>` for details.
6554
6555'``llvm.loop.unroll.followup_remainder``' Metadata
6556^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6557
6558This metadata defines which loop attributes the remainder loop after
6559partial/runtime unrolling will have. See
6560:ref:`Transformation Metadata <transformation-metadata>` for details.
6561
6562'``llvm.loop.unroll_and_jam``'
6563^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6564
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
6571
6572The metadata for unroll and jam otherwise is the same as for ``unroll``.
6573``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
6574``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
6575``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
6576and the normal safety checks will still be performed.
6577
6578'``llvm.loop.unroll_and_jam.count``' Metadata
6579^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6580
6581This metadata suggests an unroll and jam factor to use, similarly to
6582``llvm.loop.unroll.count``. The first operand is the string
6583``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
6584specifying the unroll factor. For example:
6585
6586.. code-block:: llvm
6587
6588   !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
6589
If the trip count of the loop is less than the unroll count the loop
will be partially unrolled and jammed.
6592
6593'``llvm.loop.unroll_and_jam.disable``' Metadata
6594^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6595
This metadata disables loop unroll and jam. The metadata has a single
6597operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
6598
6599.. code-block:: llvm
6600
6601   !0 = !{!"llvm.loop.unroll_and_jam.disable"}
6602
6603'``llvm.loop.unroll_and_jam.enable``' Metadata
6604^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6605
This metadata suggests that the loop should be fully unrolled and jammed if
the trip count is known at compile time and partially unrolled and jammed if
the trip count is not known at compile time. The metadata has a single operand
which is the string ``llvm.loop.unroll_and_jam.enable``.  For example:
6610
6611.. code-block:: llvm
6612
6613   !0 = !{!"llvm.loop.unroll_and_jam.enable"}
6614
6615'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
6616^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6617
6618This metadata defines which loop attributes the outer unrolled loop will
6619have. See :ref:`Transformation Metadata <transformation-metadata>` for
6620details.
6621
6622'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
6623^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6624
6625This metadata defines which loop attributes the inner jammed loop will
6626have. See :ref:`Transformation Metadata <transformation-metadata>` for
6627details.
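
As a rough sketch, a followup attribute might be combined with
``llvm.loop.unroll_and_jam.enable`` so that the inner jammed loop is
subsequently unrolled. The operand layout shown here (the followup name
followed by the attribute nodes to apply) is an assumption; the
authoritative format is the one given in
:ref:`Transformation Metadata <transformation-metadata>`.

.. code-block:: llvm

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    ; Attributes to attach to the inner jammed loop (assumed layout):
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.unroll.enable"}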
6628
6629'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
6630^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6631
6632This metadata defines which attributes the epilogue of the outer loop
6633will have. This loop is usually unrolled, meaning there is no such
6634loop. This attribute will be ignored in this case. See
6635:ref:`Transformation Metadata <transformation-metadata>` for details.
6636
6637'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
6638^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6639
6640This metadata defines which attributes the inner loop of the epilogue
6641will have. The outer epilogue will usually be unrolled, meaning there
6642can be multiple inner remainder loops. See
6643:ref:`Transformation Metadata <transformation-metadata>` for details.
6644
6645'``llvm.loop.unroll_and_jam.followup_all``' Metadata
6646^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6647
Attributes specified in the metadata are added to all
6649``llvm.loop.unroll_and_jam.*`` loops. See
6650:ref:`Transformation Metadata <transformation-metadata>` for details.
6651
6652'``llvm.loop.licm_versioning.disable``' Metadata
6653^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6654
6655This metadata indicates that the loop should not be versioned for the purpose
6656of enabling loop-invariant code motion (LICM). The metadata has a single operand
6657which is the string ``llvm.loop.licm_versioning.disable``. For example:
6658
6659.. code-block:: llvm
6660
6661   !0 = !{!"llvm.loop.licm_versioning.disable"}
6662
6663'``llvm.loop.distribute.enable``' Metadata
6664^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6665
6666Loop distribution allows splitting a loop into multiple loops.  Currently,
6667this is only performed if the entire loop cannot be vectorized due to unsafe
6668memory dependencies.  The transformation will attempt to isolate the unsafe
6669dependencies into their own loop.
6670
6671This metadata can be used to selectively enable or disable distribution of the
6672loop.  The first operand is the string ``llvm.loop.distribute.enable`` and the
6673second operand is a bit. If the bit operand value is 1 distribution is
6674enabled. A value of 0 disables distribution:
6675
6676.. code-block:: llvm
6677
6678   !0 = !{!"llvm.loop.distribute.enable", i1 0}
6679   !1 = !{!"llvm.loop.distribute.enable", i1 1}
6680
6681This metadata should be used in conjunction with ``llvm.loop`` loop
6682identification metadata.
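
As a minimal sketch of that combination (the block labels and ``%exitcond``
are illustrative placeholders), the attribute node is listed as an operand
of the self-referential loop identifier, which in turn is attached to the
loop's latch branch:

.. code-block:: llvm

   for.body:
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !1}                        ; llvm.loop identifier
   !1 = !{!"llvm.loop.distribute.enable", i1 1}   ; request distribution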
6683
6684'``llvm.loop.distribute.followup_coincident``' Metadata
6685^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6686
6687This metadata defines which attributes extracted loops with no cyclic
6688dependencies will have (i.e. can be vectorized). See
6689:ref:`Transformation Metadata <transformation-metadata>` for details.
6690
6691'``llvm.loop.distribute.followup_sequential``' Metadata
6692^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6693
6694This metadata defines which attributes the isolated loops with unsafe
6695memory dependencies will have. See
6696:ref:`Transformation Metadata <transformation-metadata>` for details.
6697
6698'``llvm.loop.distribute.followup_fallback``' Metadata
6699^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6700
If loop versioning is necessary, this metadata defines the attributes
the non-distributed fallback version will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
6704
6705'``llvm.loop.distribute.followup_all``' Metadata
6706^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6707
The attributes in this metadata are added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
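
A hedged sketch of how the ``llvm.loop.distribute.followup_*`` attributes
might be combined on one loop. It assumes that each followup node lists the
attribute nodes to apply after its name; ``llvm.loop.vectorize.enable`` and
``llvm.loop.unroll.disable`` are chosen only for illustration, and any loop
attribute could appear in their place:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2, !3}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}
   ; attributes for extracted loops without cyclic dependencies
   !2 = !{!"llvm.loop.distribute.followup_coincident", !{!"llvm.loop.vectorize.enable", i1 1}}
   ; attributes for every loop produced by distribution
   !3 = !{!"llvm.loop.distribute.followup_all", !{!"llvm.loop.unroll.disable"}}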
6711
6712'``llvm.licm.disable``' Metadata
6713^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6714
6715This metadata indicates that loop-invariant code motion (LICM) should not be
6716performed on this loop. The metadata has a single operand which is the string
6717``llvm.licm.disable``. For example:
6718
6719.. code-block:: llvm
6720
6721   !0 = !{!"llvm.licm.disable"}
6722
Note that although it operates per loop, it isn't given the ``llvm.loop``
prefix because it is not affected by the ``llvm.loop.disable_nonforced``
metadata.
6725
6726'``llvm.access.group``' Metadata
6727^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6728
``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.
6735
6736.. code-block:: llvm
6737
6738   %val = load i32, i32* %arrayidx, !llvm.access.group !0
6739   ...
6740   !0 = !{!1, !2}
6741   !1 = distinct !{}
6742   !2 = distinct !{}
6743
6744It is illegal for the list node to be empty since it might be confused
6745with an access group.
6746
The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which distinguishes it from a list of access groups.
Being empty also avoids the need to update its content, which, because
metadata is immutable by design, would require finding and updating all
references to the access group node.
6754
6755The access group can be used to refer to a memory access instruction
6756without pointing to it directly (which is not possible in global
6757metadata). Currently, the only metadata making use of it is
6758``llvm.loop.parallel_accesses``.
6759
6760'``llvm.loop.parallel_accesses``' Metadata
6761^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6762
The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between the instructions in the
loop that belong to these access groups.
6767
Let ``m1`` and ``m2`` be two instructions whose ``llvm.access.group``
metadata refers to the access groups ``g1`` and ``g2``, respectively
(the two groups might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.
6776
6777If all memory-accessing instructions in a loop have
6778``llvm.access.group`` metadata that each refer to one of the access
6779groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
6780loop has no loop carried memory dependences and is considered to be a
6781parallel loop.
6782
Note that if not all memory access instructions belong to an access
group referred to by ``llvm.loop.parallel_accesses``, then the loop must
not be considered trivially parallel. Additional memory dependence analysis
is required to make that determination. As a fail-safe mechanism, this
causes loops that were originally parallel to be considered sequential if
optimization passes that are unaware of the parallel semantics insert new
memory instructions into the loop body.
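
A sketch of this fail-safe case, using placeholder value names: the second
store carries no ``llvm.access.group`` metadata, so the loop cannot be
assumed parallel from the metadata alone.

.. code-block:: llvm

   for.body:
     ...
     %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
     ...
     store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
     ...
     store i32 %val0, i32* %arrayidx2   ; e.g. inserted by a pass unaware of the metadata
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
   !1 = distinct !{}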
6790
6791Example of a loop that is considered parallel due to its correct use of
6792both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
6793metadata types.
6794
6795.. code-block:: llvm
6796
6797   for.body:
6798     ...
6799     %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
6800     ...
6801     store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
6802     ...
6803     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
6804
6805   for.end:
6806   ...
6807   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
6808   !1 = distinct !{}
6809
6810It is also possible to have nested parallel loops:
6811
6812.. code-block:: llvm
6813
6814   outer.for.body:
6815     ...
6816     %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4
6817     ...
6818     br label %inner.for.body
6819
6820   inner.for.body:
6821     ...
6822     %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3
6823     ...
6824     store i32 %val0, i32* %arrayidx2, !llvm.access.group !3
6825     ...
6826     br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
6827
6828   inner.for.end:
6829     ...
6830     store i32 %val1, i32* %arrayidx4, !llvm.access.group !4
6831     ...
6832     br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
6833
6834   outer.for.end:                                          ; preds = %for.body
6835   ...
6836   !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
6837   !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
6838   !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
6839   !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
6840
6841'``llvm.loop.mustprogress``' Metadata
6842^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6843
The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
terminate, unwind, or interact with the environment in an observable way, e.g.
via a volatile memory access, I/O, or other synchronization. If such a loop is
not found to interact with the environment in an observable way, the loop may
be removed. This corresponds to the ``mustprogress`` function attribute.
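
A minimal sketch of the attachment, with an illustrative block name; the
attribute node contains only its name string:

.. code-block:: llvm

   loop:
     ...
     br label %loop, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.mustprogress"}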
6849
6850'``irr_loop``' Metadata
6851^^^^^^^^^^^^^^^^^^^^^^^
6852
``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:
6861
6862.. code-block:: llvm
6863
6864    header0:
6865    ...
6866    br i1 %cmp, label %t1, label %t2, !irr_loop !0
6867
6868    ...
    !0 = !{!"loop_header_weight", i64 100}
6870
6871Irreducible loop header weights are typically based on profile data.
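
To make the "relative to the other headers" aspect concrete, here is a
sketch of an irreducible loop with two headers. The block names, conditions
and weights are invented for illustration; the loop can be entered at either
``header0`` or ``header1``, so both blocks are headers and both carry a
weight:

.. code-block:: llvm

    entry:
      br i1 %c, label %header0, label %header1

    header0:
      ...
      br i1 %cmp0, label %header1, label %exit, !irr_loop !0

    header1:
      ...
      br i1 %cmp1, label %header0, label %exit, !irr_loop !1

    ...
    !0 = !{!"loop_header_weight", i64 100}
    !1 = !{!"loop_header_weight", i64 25}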
6872
6873.. _md_invariant.group:
6874
6875'``invariant.group``' Metadata
6876^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6877
6878The experimental ``invariant.group`` metadata may be attached to
6879``load``/``store`` instructions referencing a single metadata with no entries.
6880The existence of the ``invariant.group`` metadata on the instruction tells
6881the optimizer that every ``load`` and ``store`` to the same pointer operand
6882can be assumed to load or store the same
6883value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
6884when two pointers are considered the same). Pointers returned by bitcast or
6885getelementptr with only zero indices are considered the same.
6886
6887Examples:
6888
6889.. code-block:: llvm
6890
6891   @unknownPtr = external global i8
6892   ...
6893   %ptr = alloca i8
6894   store i8 42, i8* %ptr, !invariant.group !0
6895   call void @foo(i8* %ptr)
6896
6897   %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
6898   call void @foo(i8* %ptr)
6899
6900   %newPtr = call i8* @getPointer(i8* %ptr)
6901   %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
6902
6903   %unknownValue = load i8, i8* @unknownPtr
6904   store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
6905
6906   call void @foo(i8* %ptr)
6907   %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
6908   %d = load i8, i8* %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr
6909
6910   ...
6911   declare void @foo(i8*)
6912   declare i8* @getPointer(i8*)
6913   declare i8* @llvm.launder.invariant.group(i8*)
6914
6915   !0 = !{}
6916
6917The invariant.group metadata must be dropped when replacing one pointer by
6918another based on aliasing information. This is because invariant.group is tied
6919to the SSA value of the pointer operand.
6920
6921.. code-block:: llvm
6922
6923  %v = load i8, i8* %x, !invariant.group !0
6924  ; if %x mustalias %y then we can replace the above instruction with
6925  %v = load i8, i8* %y
6926
6927Note that this is an experimental feature, which means that its semantics might
6928change in the future.
6929
6930'``type``' Metadata
6931^^^^^^^^^^^^^^^^^^^
6932
6933See :doc:`TypeMetadata`.
6934
6935'``associated``' Metadata
6936^^^^^^^^^^^^^^^^^^^^^^^^^
6937
6938The ``associated`` metadata may be attached to a global variable definition with
6939a single argument that references a global object (optionally through an alias).
6940
6941This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
6942discarding of the global variable in linker GC unless the referenced object is
6943also discarded. The linker support for this feature is spotty. For best
6944compatibility, globals carrying this metadata should:
6945
6946- Be in ``@llvm.compiler.used``.
6947- If the referenced global variable is in a comdat, be in the same comdat.
6948
``!associated`` cannot express a many-to-one relationship. A global variable
with the metadata should generally not be referenced by a function: the function
may be inlined into other functions, leading to more references to the metadata.
Ideally we would want to keep metadata alive as long as any inline location is
alive, but this many-to-one relationship is not representable. Moreover, if the
metadata is retained while the function is discarded, the linker will report an
error of a relocation referencing a discarded section.
6956
6957The metadata is often used with an explicit section consisting of valid C
6958identifiers so that the runtime can find the metadata section with
6959linker-defined encapsulation symbols ``__start_<section_name>`` and
6960``__stop_<section_name>``.
6961
6962It does not have any effect on non-ELF targets.
6963
6964Example:
6965
6966.. code-block:: text
6967
6968    $a = comdat any
6969    @a = global i32 1, comdat $a
6970    @b = internal global i32 2, comdat $a, section "abc", !associated !0
6971    !0 = !{i32* @a}
6972
6973
6974'``prof``' Metadata
6975^^^^^^^^^^^^^^^^^^^
6976
6977The ``prof`` metadata is used to record profile data in the IR.
6978The first operand of the metadata node indicates the profile metadata
6979type. There are currently 3 types:
6980:ref:`branch_weights<prof_node_branch_weights>`,
6981:ref:`function_entry_count<prof_node_function_entry_count>`, and
6982:ref:`VP<prof_node_VP>`.
6983
6984.. _prof_node_branch_weights:
6985
6986branch_weights
6987""""""""""""""
6988
Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.
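
A minimal sketch for a conditional branch, with made-up weights and block
names. The weights are listed in successor order, so here the first
successor is recorded as far more likely than the second:

.. code-block:: llvm

    br i1 %cond, label %likely, label %unlikely, !prof !0
    ...
    !0 = !{!"branch_weights", i32 2000, i32 1}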
6992
6993.. _prof_node_function_entry_count:
6994
6995function_entry_count
6996""""""""""""""""""""
6997
6998Function entry count metadata can be attached to function definitions
6999to record the number of times the function is called. Used with BFI
7000information, it is also used to derive the basic block profile count.
7001For more information, see :doc:`BranchWeightMetadata`.
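
A minimal sketch, with a hypothetical function ``@f`` and an invented count
of 1000 entries:

.. code-block:: llvm

    define void @f() !prof !0 {
      ret void
    }
    !0 = !{!"function_entry_count", i64 1000}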
7002
7003.. _prof_node_VP:
7004
7005VP
7006""
7007
7008VP (value profile) metadata can be attached to instructions that have
7009value profile information. Currently this is indirect calls (where it
7010records the hottest callees) and calls to memory intrinsics such as memcpy,
7011memmove, and memset (where it records the hottest byte lengths).
7012
Each VP metadata node contains the "VP" string, then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by uint64_t value and execution count pairs.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.
7020
7021Note that the value counts do not need to add up to the total count
7022listed in the third operand (in practice only the top hottest values
7023are tracked and reported).
7024
7025Indirect call example:
7026
7027.. code-block:: llvm
7028
7029    call void %f(), !prof !1
7030    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
7031
Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective target functions
was called.
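
By analogy, a memory operation example might look as follows. This is a
sketch with invented numbers: the kind is 1, the call executed 1000 times,
and the two hottest byte lengths were 8 (700 times) and 64 (250 times). As
noted above, the value counts need not add up to the total.

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %len, i1 false), !prof !2
    !2 = !{!"VP", i32 1, i64 1000, i64 8, i64 700, i64 64, i64 250}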
7039
7040.. _md_annotation:
7041
7042'``annotation``' Metadata
7043^^^^^^^^^^^^^^^^^^^^^^^^^
7044
7045The ``annotation`` metadata can be used to attach a tuple of annotation strings
7046to any instruction. This metadata does not impact the semantics of the program
7047and may only be used to provide additional insight about the program and
7048transformations to users.
7049
7050Example:
7051
7052.. code-block:: text
7053
7054    %a.addr = alloca float*, align 8, !annotation !0
7055    !0 = !{!"auto-init"}
7056
7057Module Flags Metadata
7058=====================
7059
Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.
7066
7067The ``llvm.module.flags`` metadata contains a list of metadata triplets.
7068Each triplet has the following form:
7069
-  The first element is a *behavior* flag, which specifies the behavior
   when two (or more) modules are merged together and the linker encounters
   two (or more) metadata entries with the same ID. The supported behaviors
   are described below.
7074-  The second element is a metadata string that is a unique ID for the
7075   metadata. Each module may only have one flag entry for each unique ID (not
7076   including entries with the **Require** behavior).
7077-  The third element is the value of the flag.
7078
When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.
7085
7086The following behaviors are supported:
7087
7088.. list-table::
7089   :header-rows: 1
7090   :widths: 10 90
7091
7092   * - Value
7093     - Behavior
7094
7095   * - 1
7096     - **Error**
7097           Emits an error if two values disagree, otherwise the resulting value
7098           is that of the operands.
7099
7100   * - 2
7101     - **Warning**
7102           Emits a warning if two values disagree. The result value will be the
7103           operand for the flag from the first module being linked, or the max
7104           if the other module uses **Max** (in which case the resulting flag
7105           will be **Max**).
7106
7107   * - 3
7108     - **Require**
7109           Adds a requirement that another module flag be present and have a
7110           specified value after linking is performed. The value must be a
7111           metadata pair, where the first element of the pair is the ID of the
7112           module flag to be restricted, and the second element of the pair is
7113           the value the module flag should be restricted to. This behavior can
7114           be used to restrict the allowable results (via triggering of an
7115           error) of linking IDs with the **Override** behavior.
7116
7117   * - 4
7118     - **Override**
7119           Uses the specified value, regardless of the behavior or value of the
7120           other module. If both modules specify **Override**, but the values
7121           differ, an error will be emitted.
7122
7123   * - 5
7124     - **Append**
7125           Appends the two values, which are required to be metadata nodes.
7126
7127   * - 6
7128     - **AppendUnique**
7129           Appends the two values, which are required to be metadata
7130           nodes. However, duplicate entries in the second list are dropped
7131           during the append operation.
7132
7133   * - 7
7134     - **Max**
7135           Takes the max of the two values, which are required to be integers.
7136
7137It is an error for a particular unique flag ID to have multiple behaviors,
7138except in the case of **Require** (which adds restrictions on another metadata
7139value) or **Override**.
7140
7141An example of module flags:
7142
7143.. code-block:: llvm
7144
7145    !0 = !{ i32 1, !"foo", i32 1 }
7146    !1 = !{ i32 4, !"bar", i32 37 }
7147    !2 = !{ i32 2, !"qux", i32 42 }
7148    !3 = !{ i32 3, !"qux",
7149      !{
7150        !"foo", i32 1
7151      }
7152    }
7153    !llvm.module.flags = !{ !0, !1, !2, !3 }
7154
7155-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
7156   if two or more ``!"foo"`` flags are seen is to emit an error if their
7157   values are not equal.
7158
7159-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
7160   behavior if two or more ``!"bar"`` flags are seen is to use the value
7161   '37'.
7162
7163-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
7164   behavior if two or more ``!"qux"`` flags are seen is to emit a
7165   warning if their values are not equal.
7166
7167-  Metadata ``!3`` has the ID ``!"qux"`` and the value:
7168
7169   ::
7170
7171       !{ !"foo", i32 1 }
7172
7173   The behavior is to emit an error if the ``llvm.module.flags`` does not
7174   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
7175   performed.
7176
7177Synthesized Functions Module Flags Metadata
7178-------------------------------------------
7179
7180These metadata specify the default attributes synthesized functions should have.
7181These metadata are currently respected by a few instrumentation passes, such as
7182sanitizers.
7183
7184These metadata correspond to a few function attributes with significant code
7185generation behaviors. Function attributes with just optimization purposes
7186should not be listed because the performance impact of these synthesized
7187functions is small.
7188
7189- "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
7190  will get the "frame-pointer" function attribute, with value being "none",
7191  "non-leaf", or "all", respectively.
7192- "uwtable": **Max**. The value can be 0 or 1. If the value is 1, a synthesized
7193  function will get the ``uwtable`` function attribute.
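
A sketch of a module that requests frame pointers everywhere and unwind
tables for synthesized functions. Behavior value 7 is **Max** from the table
of behaviors above; the chosen flag values are illustrative:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2}   ; 2 maps to "all"
    !1 = !{i32 7, !"uwtable", i32 1}         ; synthesized functions get uwtable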
7194
7195Objective-C Garbage Collection Module Flags Metadata
7196----------------------------------------------------
7197
7198On the Mach-O platform, Objective-C stores metadata about garbage
7199collection in a special section called "image info". The metadata
7200consists of a version number and a bitmask specifying what types of
7201garbage collection are supported (if any) by the file. If two or more
7202modules are linked together their garbage collection metadata needs to
7203be merged rather than appended together.
7204
7205The Objective-C garbage collection module flags metadata consists of the
7206following key-value pairs:
7207
7208.. list-table::
7209   :header-rows: 1
7210   :widths: 30 70
7211
7212   * - Key
7213     - Value
7214
7215   * - ``Objective-C Version``
7216     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
7217
7218   * - ``Objective-C Image Info Version``
7219     - **[Required]** --- The version of the image info section. Currently
7220       always 0.
7221
7222   * - ``Objective-C Image Info Section``
7223     - **[Required]** --- The section to place the metadata. Valid values are
7224       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
7225       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
7226       Objective-C ABI version 2.
7227
7228   * - ``Objective-C Garbage Collection``
7229     - **[Required]** --- Specifies whether garbage collection is supported or
7230       not. Valid values are 0, for no garbage collection, and 2, for garbage
7231       collection supported.
7232
7233   * - ``Objective-C GC Only``
7234     - **[Optional]** --- Specifies that only garbage collection is supported.
7235       If present, its value must be 6. This flag requires that the
7236       ``Objective-C Garbage Collection`` flag have the value 2.
7237
7238Some important flag interactions:
7239
7240-  If a module with ``Objective-C Garbage Collection`` set to 0 is
7241   merged with a module with ``Objective-C Garbage Collection`` set to
7242   2, then the resulting module has the
7243   ``Objective-C Garbage Collection`` flag set to 0.
7244-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
7245   merged with a module with ``Objective-C GC Only`` set to 6.
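
A sketch of the required flags for a module using Objective-C ABI version 2
without garbage collection. The keys and values come from the table above;
the behavior values in the first operand of each flag are an assumption made
for illustration, not something this section specifies:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1, !2, !3}
    !0 = !{i32 1, !"Objective-C Version", i32 2}
    !1 = !{i32 1, !"Objective-C Image Info Version", i32 0}
    !2 = !{i32 1, !"Objective-C Image Info Section", !"__DATA,__objc_imageinfo, regular, no_dead_strip"}
    !3 = !{i32 4, !"Objective-C Garbage Collection", i32 0}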
7246
7247C type width Module Flags Metadata
7248----------------------------------
7249
7250The ARM backend emits a section into each generated object file describing the
7251options that it was compiled with (in a compiler-independent way) to prevent
7252linking incompatible objects, and to allow automatic library selection. Some
7253of these options are not visible at the IR level, namely wchar_t width and enum
7254width.
7255
7256To pass this information to the backend, these options are encoded in module
7257flags metadata, using the following key-value pairs:
7258
7259.. list-table::
7260   :header-rows: 1
7261   :widths: 30 70
7262
7263   * - Key
7264     - Value
7265
7266   * - short_wchar
7267     - * 0 --- sizeof(wchar_t) == 4
7268       * 1 --- sizeof(wchar_t) == 2
7269
7270   * - short_enum
7271     - * 0 --- Enums are at least as large as an ``int``.
7272       * 1 --- Enums are stored in the smallest integer type which can
7273         represent all of its values.
7274
7275For example, the following metadata section specifies that the module was
7276compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
7277enum is the smallest type which can represent all of its values::
7278
7279    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}
7282
7283LTO Post-Link Module Flags Metadata
7284-----------------------------------
7285
Some optimisations are only possible when the entire LTO unit is present in the
current module. This is represented by the ``LTOPostLink`` module flags metadata,
which will be created with a value of ``1`` when LTO linking occurs.
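
A sketch of what a module might contain after LTO linking. The behavior
value in the first operand is an assumption for illustration; only the key
and the value ``1`` are taken from the description above:

.. code-block:: llvm

    !llvm.module.flags = !{!0}
    !0 = !{i32 1, !"LTOPostLink", i32 1}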
7289
7290Automatic Linker Flags Named Metadata
7291=====================================
7292
7293Some targets support embedding of flags to the linker inside individual object
7294files. Typically this is used in conjunction with language extensions which
7295allow source files to contain linker command line options, and have these
7296automatically be transmitted to the linker via object files.
7297
7298These flags are encoded in the IR using named metadata with the name
7299``!llvm.linker.options``. Each operand is expected to be a metadata node
7300which should be a list of other metadata nodes, each of which should be a
7301list of metadata strings defining linker options.
7302
7303For example, the following metadata section specifies two separate sets of
7304linker options, presumably to link against ``libz`` and the ``Cocoa``
7305framework::
7306
7307    !0 = !{ !"-lz" }
7308    !1 = !{ !"-framework", !"Cocoa" }
7309    !llvm.linker.options = !{ !0, !1 }
7310
7311The metadata encoding as lists of lists of options, as opposed to a collapsed
7312list of options, is chosen so that the IR encoding can use multiple option
7313strings to specify e.g., a single library, while still having that specifier be
7314preserved as an atomic element that can be recognized by a target specific
7315assembly writer or object file emitter.
7316
7317Each individual option is required to be either a valid option for the target's
7318linker, or an option that is reserved by the target specific assembly writer or
7319object file emitter. No other aspect of these options is defined by the IR.
7320
7321Dependent Libs Named Metadata
7322=============================
7323
7324Some targets support embedding of strings into object files to indicate
7325a set of libraries to add to the link. Typically this is used in conjunction
7326with language extensions which allow source files to explicitly declare the
7327libraries they depend on, and have these automatically be transmitted to the
7328linker via object files.
7329
7330The list is encoded in the IR using named metadata with the name
7331``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
7332which should contain a single string operand.
7333
7334For example, the following metadata section contains two library specifiers::
7335
7336    !0 = !{!"a library specifier"}
7337    !1 = !{!"another library specifier"}
7338    !llvm.dependent-libraries = !{ !0, !1 }
7339
Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.
7342
7343.. _summary:
7344
7345ThinLTO Summary
7346===============
7347
7348Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
7349causes the building of a compact summary of the module that is emitted into
7350the bitcode. The summary is emitted into the LLVM assembly and identified
7351in syntax by a caret ('``^``').
7352
7353The summary is parsed into a bitcode output, along with the Module
7354IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
7355of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
7356summary entries (just as they currently ignore summary entries in a bitcode
7357input file).
7358
Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where the summary index is currently built from bitcode:
specifically, in tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), and when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag.
This part is not yet implemented; for now, use llvm-as to create a bitcode
object before feeding it into thin link tools.
7366
7367There are currently 3 types of summary entries in the LLVM assembly:
7368:ref:`module paths<module_path_summary>`,
7369:ref:`global values<gv_summary>`, and
7370:ref:`type identifiers<typeid_summary>`.
7371
7372.. _module_path_summary:
7373
7374Module Path Summary Entry
7375-------------------------
7376
7377Each module path summary entry lists a module containing global values included
7378in the summary. For a single IR module there will be one such entry, but
7379in a combined summary index produced during the thin link, there will be
7380one module path entry per linked module with summary.
7381
7382Example:
7383
7384.. code-block:: text
7385
7386    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
7387
7388The ``path`` field is a string path to the bitcode file, and the ``hash``
7389field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
7390incremental builds and caching.
7391
7392.. _gv_summary:
7393
7394Global Value Summary Entry
7395--------------------------
7396
7397Each global value summary entry corresponds to a global value defined or
7398referenced by a summarized module.
7399
7400Example:
7401
7402.. code-block:: text
7403
7404    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
7405
7406For declarations, there will not be a summary list. For definitions, a
7407global value will contain a list of summaries, one per module containing
7408a definition. There can be multiple entries in a combined summary index
7409for symbols with weak linkage.
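
As a sketch of the two cases, the entries below use hypothetical names and a
placeholder module hash: ``f`` is defined in module ``^0`` and therefore has
a summary list, while ``g`` is only declared and has none. The optional
fields of the function summary are omitted.

.. code-block:: text

    ^0 = module: (path: "a.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2)))
    ^2 = gv: (name: "g")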
7410
7411Each ``Summary`` format will depend on whether the global value is a
7412:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
7413:ref:`alias<alias_summary>`.
7414
7415.. _function_summary:
7416
7417Function Summary
7418^^^^^^^^^^^^^^^^
7419
7420If the global value is a function, the ``Summary`` entry will look like:
7421
7422.. code-block:: text
7423
    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
7425
7426The ``module`` field includes the summary entry id for the module containing
7427this definition, and the ``flags`` field contains information such as
7428the linkage type, a flag indicating whether it is legal to import the
7429definition, whether it is globally live and whether the linker resolved it
7430to a local definition (the latter two are populated during the thin link).
7431The ``insts`` field contains the number of IR instructions in the function.
7432Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
7433:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
7434:ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
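
A concrete instance with two of the optional fields filled in could look like
the following sketch. The entry ids and the names ``g`` and ``x`` are
hypothetical; the example assumes ``f`` calls ``g`` and references the global
``x``.

.. code-block:: text

    ^0 = module: (path: "a.o", hash: (0, 0, 0, 0, 0))
    ; f: two instructions, one call edge to ^2 and one reference to ^3.
    ^1 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2, calls: ((callee: ^2)), refs: (^3))))
    ^2 = gv: (name: "g")
    ^3 = gv: (name: "x")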
7435
7436.. _variable_summary:
7437
7438Global Variable Summary
7439^^^^^^^^^^^^^^^^^^^^^^^
7440
7441If the global value is a variable, the ``Summary`` entry will look like:
7442
7443.. code-block:: text
7444
    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
7446
7447The variable entry contains a subset of the fields in a
7448:ref:`function summary <function_summary>`, see the descriptions there.
7449
7450.. _alias_summary:
7451
7452Alias Summary
7453^^^^^^^^^^^^^
7454
7455If the global value is an alias, the ``Summary`` entry will look like:
7456
7457.. code-block:: text
7458
7459    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
7460
7461The ``module`` and ``flags`` fields are as described for a
7462:ref:`function summary <function_summary>`. The ``aliasee`` field
7463contains a reference to the global value summary entry of the aliasee.
7464
7465.. _funcflags_summary:
7466
7467Function Flags
7468^^^^^^^^^^^^^^
7469
7470The optional ``FuncFlags`` field looks like:
7471
7472.. code-block:: text
7473
7474    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
7475
7476If unspecified, flags are assumed to hold the conservative ``false`` value of
7477``0``.
7478
7479.. _calls_summary:
7480
7481Calls
7482^^^^^
7483
7484The optional ``Calls`` field looks like:
7485
7486.. code-block:: text
7487
7488    calls: ((Callee)[, (Callee)]*)
7489
7490where each ``Callee`` looks like:
7491
7492.. code-block:: text
7493
7494    callee: ^1[, hotness: None]?[, relbf: 0]?
7495
7496The ``callee`` refers to the summary entry id of the callee. At most one
7497of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
7498``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
7499branch frequency relative to the entry frequency, scaled down by 2^8)
7500may be specified. The defaults are ``Unknown`` and ``0``, respectively.
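
As a worked example of the ``relbf`` scaling, the sketch below assumes that the
stored integer is the relative frequency multiplied by 2^8, so that dividing
by 256 recovers the frequency relative to function entry. The entry ids and
values are hypothetical.

.. code-block:: text

    ; relbf: 256 -> 256 / 2^8 = 1.0 (executed about once per function entry)
    ; relbf: 512 -> 512 / 2^8 = 2.0 (executed about twice per function entry)
    calls: ((callee: ^1, relbf: 256), (callee: ^2, relbf: 512))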
7501
7502.. _params_summary:
7503
7504Params
7505^^^^^^
7506
7507The optional ``Params`` is used by ``StackSafety`` and looks like:
7508
7509.. code-block:: text
7510
    params: ((Param)[, (Param)]*)
7512
7513where each ``Param`` describes pointer parameter access inside of the
7514function and looks like:
7515
7516.. code-block:: text
7517
7518    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
7519
Here ``param`` is the number of the parameter being described, and ``offset``
is the inclusive range of byte offsets from the pointer parameter that the
function itself may access. This range does not include accesses made by the
function calls in the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded to another function
and looks like:
7527
7528.. code-block:: text
7529
7530    callee: ^3, param: 5, offset: [-3, 3]
7531
The ``callee`` refers to the summary entry id of the callee, and ``param`` is
the number of the callee parameter that points into the caller's parameter, at
an offset known to lie within the ``offset`` range. The thin link stage
consumes and removes ``calls``, updating ``Param::offset`` so that it covers
all accesses possible through ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe:
access at any offset is assumed to be possible.
7540
7541Example:
7542
7543If we have the following function:
7544
7545.. code-block:: text
7546
7547    define i64 @foo(i64* %0, i32* %1, i8* %2, i8 %3) {
7548      store i32* %1, i32** @x
7549      %5 = getelementptr inbounds i8, i8* %2, i64 5
7550      %6 = load i8, i8* %5
7551      %7 = getelementptr inbounds i8, i8* %2, i8 %3
7552      tail call void @bar(i8 %3, i8* %7)
7553      %8 = load i64, i64* %0
7554      ret i64 %8
7555    }
7556
We can expect a record like this:
7558
7559.. code-block:: text
7560
7561    params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
7562
The function may access just 8 bytes of parameter ``%0``. Its ``calls`` list is
empty, so either the parameter is not passed to any function call or ``offset``
already covers all accesses from nested function calls.
Parameter ``%1`` escapes, so its access is unknown.
The function itself can access just a single byte of parameter ``%2``.
Additional access is possible inside ``@bar`` (``^3``): the function adds a
signed offset to the pointer and passes the result as argument ``%1`` to ``^3``.
The record itself does not say how ``^3`` accesses that parameter.
Parameter ``%3`` is not a pointer.
7572
7573.. _refs_summary:
7574
7575Refs
7576^^^^
7577
7578The optional ``Refs`` field looks like:
7579
7580.. code-block:: text
7581
7582    refs: ((Ref)[, (Ref)]*)
7583
7584where each ``Ref`` contains a reference to the summary id of the referenced
7585value (e.g. ``^1``).
7586
7587.. _typeidinfo_summary:
7588
7589TypeIdInfo
7590^^^^^^^^^^
7591
7592The optional ``TypeIdInfo`` field, used for
7593`Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7594looks like:
7595
7596.. code-block:: text
7597
7598    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
7599
7600These optional fields have the following forms:
7601
7602TypeTests
7603"""""""""
7604
7605.. code-block:: text
7606
7607    typeTests: (TypeIdRef[, TypeIdRef]*)
7608
7609Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7610by summary id or ``GUID``.
7611
7612TypeTestAssumeVCalls
7613""""""""""""""""""""
7614
7615.. code-block:: text
7616
7617    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
7618
7619Where each VFuncId has the format:
7620
7621.. code-block:: text
7622
7623    vFuncId: (TypeIdRef, offset: 16)
7624
7625Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7626by summary id or ``GUID`` preceded by a ``guid:`` tag.
7627
7628TypeCheckedLoadVCalls
7629"""""""""""""""""""""
7630
7631.. code-block:: text
7632
7633    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
7634
7635Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
7636
7637TypeTestAssumeConstVCalls
7638"""""""""""""""""""""""""
7639
7640.. code-block:: text
7641
7642    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
7643
7644Where each ConstVCall has the format:
7645
7646.. code-block:: text
7647
7648    (VFuncId, args: (Arg[, Arg]*))
7649
7650and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
7651and each Arg is an integer argument number.
7652
7653TypeCheckedLoadConstVCalls
7654""""""""""""""""""""""""""
7655
7656.. code-block:: text
7657
7658    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
7659
7660Where each ConstVCall has the format described for
7661``TypeTestAssumeConstVCalls``.
7662
7663.. _typeid_summary:
7664
7665Type ID Summary Entry
7666---------------------
7667
7668Each type id summary entry corresponds to a type identifier resolution
7669which is generated during the LTO link portion of the compile when building
7670with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7671so these are only present in a combined summary index.
7672
7673Example:
7674
7675.. code-block:: text
7676
7677    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
7678
7679The ``typeTestRes`` gives the type test resolution ``kind`` (which may
7680be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
7681the ``size-1`` bit width. It is followed by optional flags, which default to 0,
7682and an optional WpdResolutions (whole program devirtualization resolution)
7683field that looks like:
7684
7685.. code-block:: text
7686
    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)

where each entry is a mapping from the given byte offset to the whole-program
devirtualization resolution WpdRes, which has one of the following formats:
7691
7692.. code-block:: text
7693
7694    wpdRes: (kind: branchFunnel)
7695    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
7696    wpdRes: (kind: indir)
7697
7698Additionally, each wpdRes has an optional ``resByArg`` field, which
7699describes the resolutions for calls with all constant integer arguments:
7700
7701.. code-block:: text
7702
7703    resByArg: (ResByArg[, ResByArg]*)
7704
7705where ResByArg is:
7706
7707.. code-block:: text
7708
7709    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])
7710
7711Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
7712or ``VirtualConstProp``. The ``info`` field is only used if the kind
7713is ``UniformRetVal`` (indicates the uniform return value), or
7714``UniqueRetVal`` (holds the return value associated with the unique vtable
7715(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
7716not support the use of absolute symbols to store constants.
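
Putting the pieces together, a fully spelled-out entry might look like the
sketch below. It reuses the names from the examples above and assumes a type
whose only resolution at offset 0 is a single implementation; it is meant as
an illustration of the nesting, not as output from a real link.

.. code-block:: text

    ; typeTestRes with only the required fields, plus one wpdResolutions entry.
    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7), wpdResolutions: ((offset: 0, wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi"))))) ; guid = 7004155349499253778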
7717
7718.. _intrinsicglobalvariables:
7719
7720Intrinsic Global Variables
7721==========================
7722
7723LLVM has a number of "magic" global variables that contain data that
7724affect code generation or other IR semantics. These are documented here.
7725All globals of this sort should have a section specified as
7726"``llvm.metadata``". This section and all globals that start with
7727"``llvm.``" are reserved for use by LLVM.
7728
7729.. _gv_llvmused:
7730
7731The '``llvm.used``' Global Variable
7732-----------------------------------
7733
7734The ``@llvm.used`` global is an array which has
7735:ref:`appending linkage <linkage_appending>`. This array contains a list of
7736pointers to named global variables, functions and aliases which may optionally
7737have a pointer cast formed of bitcast or getelementptr. For example, a legal
7738use of it is:
7739
7740.. code-block:: llvm
7741
7742    @X = global i8 4
7743    @Y = global i32 123
7744
7745    @llvm.used = appending global [2 x i8*] [
7746       i8* @X,
7747       i8* bitcast (i32* @Y to i8*)
7748    ], section "llvm.metadata"
7749
7750If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
7751and linker are required to treat the symbol as if there is a reference to the
7752symbol that it cannot see (which is why they have to be named). For example, if
7753a variable has internal linkage and no references other than that from the
7754``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
7755references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.
7757
7758On some targets, the code generator must emit a directive to the
7759assembler or object file to prevent the assembler and linker from
7760removing the symbol.
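
The internal-linkage case described above can be sketched as follows. The
global ``@counter`` is a hypothetical name; it has no other references in the
module, yet its presence in ``@llvm.used`` means it cannot be deleted.

.. code-block:: llvm

    ; No IR-level uses other than the @llvm.used entry below.
    @counter = internal global i32 0

    @llvm.used = appending global [1 x i8*] [
       i8* bitcast (i32* @counter to i8*)
    ], section "llvm.metadata"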
7761
7762.. _gv_llvmcompilerused:
7763
7764The '``llvm.compiler.used``' Global Variable
7765--------------------------------------------
7766
7767The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
7768directive, except that it only prevents the compiler from touching the
7769symbol. On targets that support it, this allows an intelligent linker to
7770optimize references to the symbol without being impeded as it would be
7771by ``@llvm.used``.
7772
This construct should be used only in rare circumstances, and should not be
exposed to source languages.
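
A minimal sketch of its use, assuming a hypothetical internal table ``@table``
that the compiler must leave alone but that the linker may still optimize:

.. code-block:: llvm

    @table = internal global [4 x i8] zeroinitializer

    ; The entry uses a getelementptr cast to obtain an i8* to the table.
    @llvm.compiler.used = appending global [1 x i8*] [
       i8* getelementptr inbounds ([4 x i8], [4 x i8]* @table, i32 0, i32 0)
    ], section "llvm.metadata"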
7775
7776.. _gv_llvmglobalctors:
7777
7778The '``llvm.global_ctors``' Global Variable
7779-------------------------------------------
7780
7781.. code-block:: llvm
7782
7783    %0 = type { i32, void ()*, i8* }
7784    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
7785
7786The ``@llvm.global_ctors`` array contains a list of constructor
7787functions, priorities, and an associated global or function.
7788The functions referenced by this array will be called in ascending order
7789of priority (i.e. lowest first) when the module is loaded. The order of
7790functions with the same priority is not defined.
7791
7792If the third field is non-null, and points to a global variable
7793or function, the initializer function will only run if the associated
7794data from the current module is not discarded.
7795On ELF the referenced global variable or function must be in a comdat.
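
The associated-data form can be sketched as below. The names ``@data`` and
``@init_data`` are hypothetical; the example assumes the data lives in a
comdat of the same name, so the constructor runs only if this module's copy of
``@data`` is the one that is kept.

.. code-block:: llvm

    $data = comdat any

    @data = linkonce_odr global i32 0, comdat

    define internal void @init_data() {
      store i32 1, i32* @data
      ret void
    }

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [
      %0 { i32 65535, void ()* @init_data, i8* bitcast (i32* @data to i8*) }
    ]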
7796
7797.. _llvmglobaldtors:
7798
7799The '``llvm.global_dtors``' Global Variable
7800-------------------------------------------
7801
7802.. code-block:: llvm
7803
7804    %0 = type { i32, void ()*, i8* }
7805    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]
7806
7807The ``@llvm.global_dtors`` array contains a list of destructor
7808functions, priorities, and an associated global or function.
7809The functions referenced by this array will be called in descending
7810order of priority (i.e. highest first) when the module is unloaded. The
7811order of functions with the same priority is not defined.
7812
7813If the third field is non-null, and points to a global variable
7814or function, the destructor function will only run if the associated
7815data from the current module is not discarded.
7816On ELF the referenced global variable or function must be in a comdat.
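
To illustrate the ordering rule, the following sketch registers two
destructors with different priorities and no associated data. The function
names and the type name ``%dtor_entry`` are hypothetical. Under the rule
above, ``@flush_logs`` (priority 200) would be called before ``@close_files``
(priority 100).

.. code-block:: llvm

    %dtor_entry = type { i32, void ()*, i8* }

    declare void @flush_logs()
    declare void @close_files()

    @llvm.global_dtors = appending global [2 x %dtor_entry] [
      %dtor_entry { i32 200, void ()* @flush_logs, i8* null },
      %dtor_entry { i32 100, void ()* @close_files, i8* null }
    ]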
7817
7818Instruction Reference
7819=====================
7820
7821The LLVM instruction set consists of several different classifications
7822of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
7823instructions <binaryops>`, :ref:`bitwise binary
7824instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
7825:ref:`other instructions <otherops>`.
7826
7827.. _terminators:
7828
7829Terminator Instructions
7830-----------------------
7831
7832As mentioned :ref:`previously <functionstructure>`, every basic block in a
7833program ends with a "Terminator" instruction, which indicates which
7834block should be executed after the current block is finished. These
7835terminator instructions typically yield a '``void``' value: they produce
7836control flow, not values (the one exception being the
7837':ref:`invoke <i_invoke>`' instruction).
7838
7839The terminator instructions are: ':ref:`ret <i_ret>`',
7840':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
7841':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
7843':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
7844':ref:`catchret <i_catchret>`',
7845':ref:`cleanupret <i_cleanupret>`',
7846and ':ref:`unreachable <i_unreachable>`'.
7847
7848.. _i_ret:
7849
7850'``ret``' Instruction
7851^^^^^^^^^^^^^^^^^^^^^
7852
7853Syntax:
7854"""""""
7855
7856::
7857
7858      ret <type> <value>       ; Return a value from a non-void function
7859      ret void                 ; Return from void function
7860
7861Overview:
7862"""""""""
7863
7864The '``ret``' instruction is used to return control flow (and optionally
7865a value) from a function back to the caller.
7866
7867There are two forms of the '``ret``' instruction: one that returns a
7868value and then causes control flow, and one that just causes control
7869flow to occur.
7870
7871Arguments:
7872""""""""""
7873
7874The '``ret``' instruction optionally accepts a single argument, the
7875return value. The type of the return value must be a ':ref:`first
7876class <t_firstclass>`' type.
7877
7878A function is not :ref:`well formed <wellformed>` if it has a non-void
7879return type and contains a '``ret``' instruction with no return value or
7880a return value with a type that does not match its type, or if it has a
7881void return type and contains a '``ret``' instruction with a return
7882value.
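
As a small sketch of these rules, the two hypothetical functions below are
well formed: each '``ret``' matches the return type of its enclosing function.

.. code-block:: llvm

    ; Non-void function: the returned value has the function's return type.
    define i32 @add_one(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; Void function: 'ret' carries no value.
    define void @nop() {
      ret void
    }

    ; Not well formed: writing 'ret void' inside @add_one, or
    ; 'ret i32 0' inside @nop.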
7883
7884Semantics:
7885""""""""""
7886
7887When the '``ret``' instruction is executed, control flow returns back to
7888the calling function's context. If the caller is a
7889":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller is an
7891":ref:`invoke <i_invoke>`" instruction, execution continues at the
7892beginning of the "normal" destination block. If the instruction returns
7893a value, that value shall set the call or invoke instruction's return
7894value.
7895
7896Example:
7897""""""""
7898
7899.. code-block:: llvm
7900
7901      ret i32 5                       ; Return an integer value of 5
7902      ret void                        ; Return from a void function
7903      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
7904
7905.. _i_br:
7906
7907'``br``' Instruction
7908^^^^^^^^^^^^^^^^^^^^
7909
7910Syntax:
7911"""""""
7912
7913::
7914
7915      br i1 <cond>, label <iftrue>, label <iffalse>
7916      br label <dest>          ; Unconditional branch
7917
7918Overview:
7919"""""""""
7920
7921The '``br``' instruction is used to cause control flow to transfer to a
7922different basic block in the current function. There are two forms of
7923this instruction, corresponding to a conditional branch and an
7924unconditional branch.
7925
7926Arguments:
7927""""""""""
7928
7929The conditional branch form of the '``br``' instruction takes a single
7930'``i1``' value and two '``label``' values. The unconditional form of the
7931'``br``' instruction takes a single '``label``' value as a target.
7932
7933Semantics:
7934""""""""""
7935
7936Upon execution of a conditional '``br``' instruction, the '``i1``'
7937argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
7939to the '``iffalse``' ``label`` argument.
7940If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
7941behavior.
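
One way to guard a branch whose condition might be ``undef`` or ``poison`` is
to pass it through ``freeze`` first. The sketch below assumes the ``freeze``
instruction is available in the IR version in use; the function and label
names are hypothetical.

.. code-block:: llvm

    define i32 @pick(i1 %cond) {
    entry:
      ; freeze yields an arbitrary but fixed i1 if %cond is undef or poison,
      ; so the branch below does not trigger undefined behavior.
      %c = freeze i1 %cond
      br i1 %c, label %yes, label %no
    yes:
      ret i32 1
    no:
      ret i32 0
    }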
7942
7943Example:
7944""""""""
7945
7946.. code-block:: llvm
7947
7948    Test:
7949      %cond = icmp eq i32 %a, %b
7950      br i1 %cond, label %IfEqual, label %IfUnequal
7951    IfEqual:
7952      ret i32 1
7953    IfUnequal:
7954      ret i32 0
7955
7956.. _i_switch:
7957
7958'``switch``' Instruction
7959^^^^^^^^^^^^^^^^^^^^^^^^
7960
7961Syntax:
7962"""""""
7963
7964::
7965
7966      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
7967
7968Overview:
7969"""""""""
7970
7971The '``switch``' instruction is used to transfer control flow to one of
7972several different places. It is a generalization of the '``br``'
7973instruction, allowing a branch to occur to one of many possible
7974destinations.
7975
7976Arguments:
7977""""""""""
7978
7979The '``switch``' instruction uses three parameters: an integer
7980comparison value '``value``', a default '``label``' destination, and an
7981array of pairs of comparison value constants and '``label``'s. The table
7982is not allowed to contain duplicate constant entries.
7983
7984Semantics:
7985""""""""""
7986
7987The ``switch`` instruction specifies a table of values and destinations.
7988When the '``switch``' instruction is executed, this table is searched
7989for the given value. If the value is found, control flow is transferred
7990to the corresponding destination; otherwise, control flow is transferred
7991to the default destination.
7992If '``value``' is ``poison`` or ``undef``, this instruction has undefined
7993behavior.
7994
7995Implementation:
7996"""""""""""""""
7997
7998Depending on properties of the target machine and the particular
7999``switch`` instruction, this instruction may be code generated in
8000different ways. For example, it could be generated as a series of
8001chained conditional branches or with a lookup table.
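
To illustrate the chained-branch strategy at the IR level, the sketch below
pairs a ``switch`` with a hand-written chain of conditional branches that
computes the same result. The function and label names are hypothetical, and
an actual code generator makes this choice during lowering, not in the IR.

.. code-block:: llvm

    define i32 @classify(i32 %v) {
    entry:
      switch i32 %v, label %other [ i32 0, label %zero
                                    i32 1, label %one ]
    zero:
      ret i32 10
    one:
      ret i32 20
    other:
      ret i32 -1
    }

    ; Same behavior expressed as chained conditional branches.
    define i32 @classify_chain(i32 %v) {
    entry:
      %is0 = icmp eq i32 %v, 0
      br i1 %is0, label %zero, label %check1
    check1:
      %is1 = icmp eq i32 %v, 1
      br i1 %is1, label %one, label %other
    zero:
      ret i32 10
    one:
      ret i32 20
    other:
      ret i32 -1
    }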
8002
8003Example:
8004""""""""
8005
8006.. code-block:: llvm
8007
8008     ; Emulate a conditional br instruction
8009     %Val = zext i1 %value to i32
8010     switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
8011
8012     ; Emulate an unconditional br instruction
8013     switch i32 0, label %dest [ ]
8014
8015     ; Implement a jump table:
8016     switch i32 %val, label %otherwise [ i32 0, label %onzero
8017                                         i32 1, label %onone
8018                                         i32 2, label %ontwo ]
8019
8020.. _i_indirectbr:
8021
8022'``indirectbr``' Instruction
8023^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8024
8025Syntax:
8026"""""""
8027
8028::
8029
8030      indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
8031
8032Overview:
8033"""""""""
8034
8035The '``indirectbr``' instruction implements an indirect branch to a
8036label within the current function, whose address is specified by
"``address``". The address must be derived from a
8038:ref:`blockaddress <blockaddress>` constant.
8039
8040Arguments:
8041""""""""""
8042
8043The '``address``' argument is the address of the label to jump to. The
8044rest of the arguments indicate the full set of possible destinations
8045that the address may point to. Blocks are allowed to occur multiple
8046times in the destination list, though this isn't particularly useful.
8047
8048This destination list is required so that dataflow analysis has an
8049accurate understanding of the CFG.
8050
8051Semantics:
8052""""""""""
8053
8054Control transfers to the block specified in the address argument. All
8055possible destination blocks must be listed in the label list, otherwise
8056this instruction has undefined behavior. This implies that jumps to
8057labels defined in other functions have undefined behavior as well.
8058If '``address``' is ``poison`` or ``undef``, this instruction has undefined
8059behavior.
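
A self-contained sketch of these rules, with hypothetical function and label
names: the address is chosen from two ``blockaddress`` constants, and both
candidate blocks appear in the destination list.

.. code-block:: llvm

    define i32 @dispatch(i1 %sel) {
    entry:
      ; Both possible addresses are blockaddress constants of this function.
      %addr = select i1 %sel, i8* blockaddress(@dispatch, %bb1),
                              i8* blockaddress(@dispatch, %bb2)
      ; Every block %addr may point to is listed as a destination.
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }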
8060
8061Implementation:
8062"""""""""""""""
8063
8064This is typically implemented with a jump through a register.
8065
8066Example:
8067""""""""
8068
8069.. code-block:: llvm
8070
8071     indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
8072
8073.. _i_invoke:
8074
8075'``invoke``' Instruction
8076^^^^^^^^^^^^^^^^^^^^^^^^
8077
8078Syntax:
8079"""""""
8080
8081::
8082
8083      <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8084                    [operand bundles] to label <normal label> unwind label <exception label>
8085
8086Overview:
8087"""""""""
8088
8089The '``invoke``' instruction causes control to transfer to a specified
8090function, with the possibility of control flow transfer to either the
8091'``normal``' label or the '``exception``' label. If the callee function
8092returns with the "``ret``" instruction, control flow will return to the
8093"normal" label. If the callee (or any indirect callees) returns via the
8094":ref:`resume <i_resume>`" instruction or other exception handling
8095mechanism, control is interrupted and continued at the dynamically
8096nearest "exception" label.
8097
8098The '``exception``' label is a `landing
8099pad <ExceptionHandling.html#overview>`_ for the exception. As such,
the '``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
8105instruction, so that the important information contained within the
8106"``landingpad``" instruction can't be lost through normal code motion.
8107
8108Arguments:
8109""""""""""
8110
8111This instruction requires several arguments:
8112
8113#. The optional "cconv" marker indicates which :ref:`calling
8114   convention <callingconv>` the call should use. If none is
8115   specified, the call defaults to using C calling conventions.
8116#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8117   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8118   are valid here.
8119#. The optional addrspace attribute can be used to indicate the address space
8120   of the called function. If it is not specified, the program address space
8121   from the :ref:`datalayout string<langref_datalayout>` will be used.
8122#. '``ty``': the type of the call instruction itself which is also the
8123   type of the return value. Functions that return no value are marked
8124   ``void``.
8125#. '``fnty``': shall be the signature of the function being invoked. The
8126   argument types must match the types implied by this signature. This
8127   type can be omitted if the function is not varargs.
8128#. '``fnptrval``': An LLVM value containing a pointer to a function to
8129   be invoked. In most cases, this is a direct function invocation, but
8130   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
8131   to function value.
8132#. '``function args``': argument list whose types match the function
8133   signature argument types and parameter attributes. All arguments must
8134   be of :ref:`first class <t_firstclass>` type. If the function signature
8135   indicates the function accepts a variable number of arguments, the
8136   extra arguments can be specified.
8137#. '``normal label``': the label reached when the called function
8138   executes a '``ret``' instruction.
8139#. '``exception label``': the label reached when a callee returns via
8140   the :ref:`resume <i_resume>` instruction or other exception handling
8141   mechanism.
8142#. The optional :ref:`function attributes <fnattrs>` list.
8143#. The optional :ref:`operand bundles <opbundles>` list.
8144
8145Semantics:
8146""""""""""
8147
8148This instruction is designed to operate as a standard '``call``'
8149instruction in most regards. The primary difference is that it
8150establishes an association with a label, which is used by the runtime
8151library to unwind the stack.
8152
8153This instruction is used in languages with destructors to ensure that
8154proper cleanup is performed in the case of either a ``longjmp`` or a
8155thrown exception. Additionally, this is important for implementation of
8156'``catch``' clauses in high-level languages that support them.
8157
8158For the purposes of the SSA form, the definition of the value returned
8159by the '``invoke``' instruction is deemed to occur on the edge from the
8160current block to the "normal" label. If the callee unwinds then no
8161return value is available.
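
The following sketch shows an '``invoke``' in context. It assumes the Itanium
C++ personality routine ``@__gxx_personality_v0`` and a hypothetical callee
``@may_throw``. The result ``%r`` is used only in the normal destination, and
the unwind destination begins with a ``landingpad`` that performs cleanup and
then resumes unwinding.

.. code-block:: llvm

    declare i32 @may_throw(i32)
    declare i32 @__gxx_personality_v0(...)

    define i32 @caller(i32 %x) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      %r = invoke i32 @may_throw(i32 %x)
              to label %cont unwind label %lpad
    cont:
      ; %r is defined on the edge entry -> cont, so it is usable here.
      ret i32 %r
    lpad:
      ; No return value is available on the unwind path.
      %exn = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %exn
    }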
8162
8163Example:
8164""""""""
8165
8166.. code-block:: llvm
8167
8168      %retval = invoke i32 @Test(i32 15) to label %Continue
8169                  unwind label %TestCleanup              ; i32:retval set
8170      %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
8171                  unwind label %TestCleanup              ; i32:retval set
8172
8173.. _i_callbr:
8174
8175'``callbr``' Instruction
8176^^^^^^^^^^^^^^^^^^^^^^^^
8177
8178Syntax:
8179"""""""
8180
8181::
8182
8183      <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8184                    [operand bundles] to label <fallthrough label> [indirect labels]
8185
8186Overview:
8187"""""""""
8188
8189The '``callbr``' instruction causes control to transfer to a specified
8190function, with the possibility of control flow transfer to either the
8191'``fallthrough``' label or one of the '``indirect``' labels.
8192
8193This instruction should only be used to implement the "goto" feature of gcc
8194style inline assembly. Any other usage is an error in the IR verifier.
8195
8196Arguments:
8197""""""""""
8198
8199This instruction requires several arguments:
8200
8201#. The optional "cconv" marker indicates which :ref:`calling
8202   convention <callingconv>` the call should use. If none is
8203   specified, the call defaults to using C calling conventions.
8204#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8205   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8206   are valid here.
8207#. The optional addrspace attribute can be used to indicate the address space
8208   of the called function. If it is not specified, the program address space
8209   from the :ref:`datalayout string<langref_datalayout>` will be used.
8210#. '``ty``': the type of the call instruction itself which is also the
8211   type of the return value. Functions that return no value are marked
8212   ``void``.
8213#. '``fnty``': shall be the signature of the function being called. The
8214   argument types must match the types implied by this signature. This
8215   type can be omitted if the function is not varargs.
8216#. '``fnptrval``': An LLVM value containing a pointer to a function to
8217   be called. In most cases, this is a direct function call, but
   indirect ``callbr``'s are just as possible, calling an arbitrary pointer
8219   to function value.
8220#. '``function args``': argument list whose types match the function
8221   signature argument types and parameter attributes. All arguments must
8222   be of :ref:`first class <t_firstclass>` type. If the function signature
8223   indicates the function accepts a variable number of arguments, the
8224   extra arguments can be specified.
8225#. '``fallthrough label``': the label reached when the inline assembly's
8226   execution exits the bottom.
8227#. '``indirect labels``': the labels reached when a callee transfers control
8228   to a location other than the '``fallthrough label``'. The blockaddress
8229   constant for these should also be in the list of '``function args``'.
8230#. The optional :ref:`function attributes <fnattrs>` list.
8231#. The optional :ref:`operand bundles <opbundles>` list.
8232
8233Semantics:
8234""""""""""
8235
8236This instruction is designed to operate as a standard '``call``'
8237instruction in most regards. The primary difference is that it
8238establishes an association with additional labels to define where control
8239flow goes after the call.
8240
8241The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).
8243
8244The only use of this today is to implement the "goto" feature of gcc inline
8245assembly where additional labels can be provided as locations for the inline
8246assembly to jump to.
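
As a sketch of how the pieces fit into a function, the hypothetical
``@asm_goto`` below wraps an empty inline assembly string with one indirect
label. The ``blockaddress`` of the indirect destination is passed in the
argument list, as described under Arguments.

.. code-block:: llvm

    define i32 @asm_goto(i32 %x) {
    entry:
      callbr void asm "", "r,X"(i32 %x, i8* blockaddress(@asm_goto, %indirect))
              to label %fallthrough [label %indirect]
    fallthrough:
      ; Reached when the inline assembly exits at the bottom.
      ret i32 0
    indirect:
      ; Reached when the inline assembly jumps to the supplied label.
      ret i32 1
    }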
8247
8248Example:
8249""""""""
8250
8251.. code-block:: llvm
8252
8253      ; "asm goto" without output constraints.
8254      callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8255                  to label %fallthrough [label %indirect]
8256
8257      ; "asm goto" with output constraints.
8258      <result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8259                  to label %fallthrough [label %indirect]
8260
8261.. _i_resume:
8262
8263'``resume``' Instruction
8264^^^^^^^^^^^^^^^^^^^^^^^^
8265
8266Syntax:
8267"""""""
8268
8269::
8270
8271      resume <type> <value>
8272
8273Overview:
8274"""""""""
8275
8276The '``resume``' instruction is a terminator instruction that has no
8277successors.
8278
8279Arguments:
8280""""""""""
8281
8282The '``resume``' instruction requires one argument, which must have the
8283same type as the result of any '``landingpad``' instruction in the same
8284function.
8285
8286Semantics:
8287""""""""""
8288
8289The '``resume``' instruction resumes propagation of an existing
8290(in-flight) exception whose unwinding was interrupted with a
8291:ref:`landingpad <i_landingpad>` instruction.
8292
8293Example:
8294""""""""
8295
8296.. code-block:: llvm
8297
8298      resume { i8*, i32 } %exn
8299
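The following sketch shows one way a '``resume``' might be reached: a landing
pad runs cleanup code and then continues unwinding. It assumes the Itanium C++
personality routine ``@__gxx_personality_v0``; ``@may_throw`` and ``@release``
are hypothetical functions used only for illustration.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__gxx_personality_v0(...)

      define void @cleanup_then_rethrow() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %lpad

      done:
        ret void

      lpad:
        ; Unwinding is interrupted here so that cleanup code can run.
        %exn = landingpad { i8*, i32 }
                cleanup
        call void @release()
        ; Continue propagating the in-flight exception.
        resume { i8*, i32 } %exn
      }
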
8300.. _i_catchswitch:
8301
8302'``catchswitch``' Instruction
8303^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8304
8305Syntax:
8306"""""""
8307
8308::
8309
8310      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
8311      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
8312
8313Overview:
8314"""""""""
8315
8316The '``catchswitch``' instruction is used by `LLVM's exception handling system
8317<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
8318that may be executed by the :ref:`EH personality routine <personalityfn>`.
8319
8320Arguments:
8321""""""""""
8322
8323The ``parent`` argument is the token of the funclet that contains the
8324``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
8325this operand may be the token ``none``.
8326
8327The ``default`` argument is the label of another basic block beginning with
8328either a ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination
8329must be a legal target with respect to the ``parent`` links, as described in
8330the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8331
8332The ``handlers`` are a nonempty list of successor blocks that each begin with a
8333:ref:`catchpad <i_catchpad>` instruction.
8334
8335Semantics:
8336""""""""""
8337
8338Executing this instruction transfers control to one of the successors in
8339``handlers``, if appropriate, or continues to unwind via the unwind label if
8340present.
8341
8342The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
8343it must be both the first non-phi instruction and last instruction in the basic
8344block. Therefore, it must be the only non-phi instruction in the block.
8345
8346Example:
8347""""""""
8348
8349.. code-block:: text
8350
8351    dispatch1:
8352      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
8353    dispatch2:
8354      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
8355
8356.. _i_catchret:
8357
8358'``catchret``' Instruction
8359^^^^^^^^^^^^^^^^^^^^^^^^^^
8360
8361Syntax:
8362"""""""
8363
8364::
8365
8366      catchret from <token> to label <normal>
8367
8368Overview:
8369"""""""""
8370
8371The '``catchret``' instruction is a terminator instruction that has a
8372single successor.
8373
8374
8375Arguments:
8376""""""""""
8377
8378The first argument to a '``catchret``' indicates which ``catchpad`` it
8379exits.  It must be a :ref:`catchpad <i_catchpad>`.
8380The second argument to a '``catchret``' specifies where control will
8381transfer to next.
8382
8383Semantics:
8384""""""""""
8385
8386The '``catchret``' instruction ends an existing (in-flight) exception whose
8387unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction.  The
8388:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
8389code to, for example, destroy the active exception.  Control then transfers to
8390``normal``.
8391
8392The ``token`` argument must be a token produced by a ``catchpad`` instruction.
8393If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
8394funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8395the ``catchret``'s behavior is undefined.
8396
8397Example:
8398""""""""
8399
8400.. code-block:: text
8401
      catchret from %catch to label %continue
8403
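As a sketch of how '``catchswitch``', '``catchpad``' and '``catchret``' fit
together, the function below catches any exception thrown by a call and then
resumes normal control flow. It assumes the MSVC C++ personality routine
``@__CxxFrameHandler3`` and that personality's catch-all ``catchpad``
arguments; ``@may_throw`` is a hypothetical function.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define i32 @try_catch_all() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %continue unwind label %dispatch

      dispatch:
        ; The catchswitch is the only non-phi instruction in its block.
        %cs = catchswitch within none [label %handler] unwind to caller

      handler:
        %catch = catchpad within %cs [i8* null, i32 64, i8* null]
        ; Leave the funclet and transfer control to %caught.
        catchret from %catch to label %caught

      caught:
        ret i32 1

      continue:
        ret i32 0
      }
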
8404.. _i_cleanupret:
8405
8406'``cleanupret``' Instruction
8407^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8408
8409Syntax:
8410"""""""
8411
8412::
8413
8414      cleanupret from <value> unwind label <continue>
8415      cleanupret from <value> unwind to caller
8416
8417Overview:
8418"""""""""
8419
8420The '``cleanupret``' instruction is a terminator instruction that has
8421an optional successor.
8422
8423
8424Arguments:
8425""""""""""
8426
8427The '``cleanupret``' instruction requires one argument, which indicates
8428which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
8429If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
8430funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8431the ``cleanupret``'s behavior is undefined.
8432
8433The '``cleanupret``' instruction also has an optional successor, ``continue``,
8434which must be the label of another basic block beginning with either a
8435``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination must
8436be a legal target with respect to the ``parent`` links, as described in the
8437`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8438
8439Semantics:
8440""""""""""
8441
8442The '``cleanupret``' instruction indicates to the
8443:ref:`personality function <personalityfn>` that one
8444:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
8445It transfers control to ``continue`` or unwinds out of the function.
8446
8447Example:
8448""""""""
8449
8450.. code-block:: text
8451
8452      cleanupret from %cleanup unwind to caller
8453      cleanupret from %cleanup unwind label %continue
8454
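The sketch below shows a '``cleanupret``' ending a cleanup funclet and
unwinding out of the function. It assumes the MSVC C++ personality routine
``@__CxxFrameHandler3``; ``@may_throw`` and ``@release`` are hypothetical
functions.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__CxxFrameHandler3(...)

      define void @run_cleanup() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %ehcleanup

      done:
        ret void

      ehcleanup:
        %cleanup = cleanuppad within none []
        ; Calls inside the funclet carry a "funclet" operand bundle.
        call void @release() [ "funclet"(token %cleanup) ]
        ; The cleanup has ended; continue unwinding into the caller.
        cleanupret from %cleanup unwind to caller
      }
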
8455.. _i_unreachable:
8456
8457'``unreachable``' Instruction
8458^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8459
8460Syntax:
8461"""""""
8462
8463::
8464
8465      unreachable
8466
8467Overview:
8468"""""""""
8469
The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used, for example, to indicate that
the code after a call to a no-return function cannot be reached.
8474
8475Semantics:
8476""""""""""
8477
8478The '``unreachable``' instruction has no defined semantics.
8479
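A typical use, sketched here with a hypothetical ``@abort`` declaration
marked ``noreturn``, is to terminate the block that follows a call which never
returns:

.. code-block:: llvm

      declare void @abort() noreturn

      define i32 @checked(i32 %x) {
      entry:
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %fail, label %ok

      fail:
        call void @abort()
        ; Control never gets past the call above.
        unreachable

      ok:
        ret i32 %x
      }
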
8480.. _unaryops:
8481
8482Unary Operations
----------------
8484
8485Unary operators require a single operand, execute an operation on
8486it, and produce a single value. The operand might represent multiple
8487data, as is the case with the :ref:`vector <t_vector>` data type. The
8488result value has the same type as its operand.
8489
8490.. _i_fneg:
8491
8492'``fneg``' Instruction
8493^^^^^^^^^^^^^^^^^^^^^^
8494
8495Syntax:
8496"""""""
8497
8498::
8499
8500      <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result
8501
8502Overview:
8503"""""""""
8504
8505The '``fneg``' instruction returns the negation of its operand.
8506
8507Arguments:
8508""""""""""
8509
8510The argument to the '``fneg``' instruction must be a
8511:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8512floating-point values.
8513
8514Semantics:
8515""""""""""
8516
8517The value produced is a copy of the operand with its sign bit flipped.
8518This instruction can also take any number of :ref:`fast-math
8519flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8521
8522Example:
8523""""""""
8524
8525.. code-block:: text
8526
      <result> = fneg float %val          ; yields float:result = -%val
8528
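For illustration, the following sketch applies '``fneg``' to a scalar and,
with fast-math flags, to a vector. The function names are hypothetical.

.. code-block:: llvm

      define float @negate(float %x) {
        ; Flips the sign bit of %x.
        %r = fneg float %x
        ret float %r
      }

      define <4 x float> @negate_vec(<4 x float> %v) {
        ; Negates each element; nnan and ninf are optional fast-math flags.
        %r = fneg nnan ninf <4 x float> %v
        ret <4 x float> %r
      }
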
8529.. _binaryops:
8530
8531Binary Operations
8532-----------------
8533
8534Binary operators are used to do most of the computation in a program.
8535They require two operands of the same type, execute an operation on
8536them, and produce a single value. The operands might represent multiple
8537data, as is the case with the :ref:`vector <t_vector>` data type. The
8538result value has the same type as its operands.
8539
8540There are several different binary operators:
8541
8542.. _i_add:
8543
8544'``add``' Instruction
8545^^^^^^^^^^^^^^^^^^^^^
8546
8547Syntax:
8548"""""""
8549
8550::
8551
8552      <result> = add <ty> <op1>, <op2>          ; yields ty:result
8553      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
8554      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
8555      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8556
8557Overview:
8558"""""""""
8559
8560The '``add``' instruction returns the sum of its two operands.
8561
8562Arguments:
8563""""""""""
8564
8565The two arguments to the '``add``' instruction must be
8566:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8567arguments must have identical types.
8568
8569Semantics:
8570""""""""""
8571
8572The value produced is the integer sum of the two operands.
8573
8574If the sum has unsigned overflow, the result returned is the
8575mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8576the result.
8577
8578Because LLVM integers use a two's complement representation, this
8579instruction is appropriate for both signed and unsigned integers.
8580
8581``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8582respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8583result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
8584unsigned and/or signed overflow, respectively, occurs.
8585
8586Example:
8587""""""""
8588
8589.. code-block:: text
8590
8591      <result> = add i32 4, %var          ; yields i32:result = 4 + %var
8592
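The difference between the plain and flagged forms can be sketched on ``i8``
operands. The function names are illustrative only; the comments work through
specific inputs.

.. code-block:: llvm

      define i8 @add_wrap(i8 %x, i8 %y) {
        ; %x = -1 (255 unsigned), %y = 1: 256 mod 2^8 = 0.
        %r = add i8 %x, %y
        ret i8 %r
      }

      define i8 @add_nuw(i8 %x, i8 %y) {
        ; %x = -1 (255 unsigned), %y = 1: unsigned overflow, so %r is poison.
        %r = add nuw i8 %x, %y
        ret i8 %r
      }

      define i8 @add_nsw(i8 %x, i8 %y) {
        ; %x = -1, %y = 1: no signed overflow, so %r is 0.
        ; %x = 127, %y = 1: signed overflow, so %r is poison.
        %r = add nsw i8 %x, %y
        ret i8 %r
      }
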
8593.. _i_fadd:
8594
8595'``fadd``' Instruction
8596^^^^^^^^^^^^^^^^^^^^^^
8597
8598Syntax:
8599"""""""
8600
8601::
8602
8603      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8604
8605Overview:
8606"""""""""
8607
8608The '``fadd``' instruction returns the sum of its two operands.
8609
8610Arguments:
8611""""""""""
8612
8613The two arguments to the '``fadd``' instruction must be
8614:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8615floating-point values. Both arguments must have identical types.
8616
8617Semantics:
8618""""""""""
8619
8620The value produced is the floating-point sum of the two operands.
8621This instruction is assumed to execute in the default :ref:`floating-point
8622environment <floatenv>`.
8623This instruction can also take any number of :ref:`fast-math
8624flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8626
8627Example:
8628""""""""
8629
8630.. code-block:: text
8631
8632      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
8633
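As a sketch of how fast-math flags attach to individual instructions, the
hypothetical function below marks both additions with ``reassoc``:

.. code-block:: llvm

      define float @sum3(float %a, float %b, float %c) {
        ; reassoc permits the optimizer to reassociate these additions,
        ; for example by computing %b + %c first.
        %ab = fadd reassoc float %a, %b
        %abc = fadd reassoc float %ab, %c
        ret float %abc
      }
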
8634.. _i_sub:
8635
8636'``sub``' Instruction
8637^^^^^^^^^^^^^^^^^^^^^
8638
8639Syntax:
8640"""""""
8641
8642::
8643
8644      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
8645      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
8646      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
8647      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8648
8649Overview:
8650"""""""""
8651
8652The '``sub``' instruction returns the difference of its two operands.
8653
8654Note that the '``sub``' instruction is used to represent the '``neg``'
8655instruction present in most other intermediate representations.
8656
8657Arguments:
8658""""""""""
8659
8660The two arguments to the '``sub``' instruction must be
8661:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8662arguments must have identical types.
8663
8664Semantics:
8665""""""""""
8666
8667The value produced is the integer difference of the two operands.
8668
8669If the difference has unsigned overflow, the result returned is the
8670mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8671the result.
8672
8673Because LLVM integers use a two's complement representation, this
8674instruction is appropriate for both signed and unsigned integers.
8675
8676``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8677respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8678result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
8679unsigned and/or signed overflow, respectively, occurs.
8680
8681Example:
8682""""""""
8683
8684.. code-block:: text
8685
8686      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val
8688
8689.. _i_fsub:
8690
8691'``fsub``' Instruction
8692^^^^^^^^^^^^^^^^^^^^^^
8693
8694Syntax:
8695"""""""
8696
8697::
8698
8699      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8700
8701Overview:
8702"""""""""
8703
8704The '``fsub``' instruction returns the difference of its two operands.
8705
8706Arguments:
8707""""""""""
8708
8709The two arguments to the '``fsub``' instruction must be
8710:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8711floating-point values. Both arguments must have identical types.
8712
8713Semantics:
8714""""""""""
8715
8716The value produced is the floating-point difference of the two operands.
8717This instruction is assumed to execute in the default :ref:`floating-point
8718environment <floatenv>`.
8719This instruction can also take any number of :ref:`fast-math
8720flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8722
8723Example:
8724""""""""
8725
8726.. code-block:: text
8727
8728      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val
8730
8731.. _i_mul:
8732
8733'``mul``' Instruction
8734^^^^^^^^^^^^^^^^^^^^^
8735
8736Syntax:
8737"""""""
8738
8739::
8740
8741      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
8742      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
8743      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
8744      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8745
8746Overview:
8747"""""""""
8748
8749The '``mul``' instruction returns the product of its two operands.
8750
8751Arguments:
8752""""""""""
8753
8754The two arguments to the '``mul``' instruction must be
8755:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8756arguments must have identical types.
8757
8758Semantics:
8759""""""""""
8760
8761The value produced is the integer product of the two operands.
8762
8763If the result of the multiplication has unsigned overflow, the result
8764returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
8765bit width of the result.
8766
8767Because LLVM integers use a two's complement representation, and the
8768result is the same width as the operands, this instruction returns the
8769correct result for both signed and unsigned integers. If a full product
8770(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
8771sign-extended or zero-extended as appropriate to the width of the full
8772product.
8773
8774``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8775respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8776result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
8777unsigned and/or signed overflow, respectively, occurs.
8778
8779Example:
8780""""""""
8781
8782.. code-block:: text
8783
8784      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
8785
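The full-product case mentioned above can be sketched as follows, with
hypothetical function names. The operands are extended to ``i64`` first, so
the product of two 32-bit values cannot overflow and the ``nuw`` or ``nsw``
flag can safely be attached.

.. code-block:: llvm

      define i64 @umul_full(i32 %a, i32 %b) {
        ; Unsigned full product: zero-extend both operands.
        %a.ext = zext i32 %a to i64
        %b.ext = zext i32 %b to i64
        %p = mul nuw i64 %a.ext, %b.ext
        ret i64 %p
      }

      define i64 @smul_full(i32 %a, i32 %b) {
        ; Signed full product: sign-extend both operands.
        %a.ext = sext i32 %a to i64
        %b.ext = sext i32 %b to i64
        %p = mul nsw i64 %a.ext, %b.ext
        ret i64 %p
      }
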
8786.. _i_fmul:
8787
8788'``fmul``' Instruction
8789^^^^^^^^^^^^^^^^^^^^^^
8790
8791Syntax:
8792"""""""
8793
8794::
8795
8796      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8797
8798Overview:
8799"""""""""
8800
8801The '``fmul``' instruction returns the product of its two operands.
8802
8803Arguments:
8804""""""""""
8805
8806The two arguments to the '``fmul``' instruction must be
8807:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8808floating-point values. Both arguments must have identical types.
8809
8810Semantics:
8811""""""""""
8812
8813The value produced is the floating-point product of the two operands.
8814This instruction is assumed to execute in the default :ref:`floating-point
8815environment <floatenv>`.
8816This instruction can also take any number of :ref:`fast-math
8817flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8819
8820Example:
8821""""""""
8822
8823.. code-block:: text
8824
8825      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
8826
8827.. _i_udiv:
8828
8829'``udiv``' Instruction
8830^^^^^^^^^^^^^^^^^^^^^^
8831
8832Syntax:
8833"""""""
8834
8835::
8836
8837      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
8838      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
8839
8840Overview:
8841"""""""""
8842
8843The '``udiv``' instruction returns the quotient of its two operands.
8844
8845Arguments:
8846""""""""""
8847
8848The two arguments to the '``udiv``' instruction must be
8849:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8850arguments must have identical types.
8851
8852Semantics:
8853""""""""""
8854
8855The value produced is the unsigned integer quotient of the two operands.
8856
8857Note that unsigned integer division and signed integer division are
8858distinct operations; for signed integer division, use '``sdiv``'.
8859
8860Division by zero is undefined behavior. For vectors, if any element
8861of the divisor is zero, the operation has undefined behavior.
8862
8863
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, "((a udiv exact b) mul b) == a").
8867
8868Example:
8869""""""""
8870
8871.. code-block:: text
8872
8873      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var
8874
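A minimal sketch of the ``exact`` form, assuming a hypothetical function that
converts a byte count known to be a multiple of 4 into an element count:

.. code-block:: llvm

      define i64 @element_count(i64 %bytes) {
        ; %bytes = 12 yields 3.
        ; %bytes = 10 is not a multiple of 4, so %n is poison.
        %n = udiv exact i64 %bytes, 4
        ret i64 %n
      }
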
8875.. _i_sdiv:
8876
8877'``sdiv``' Instruction
8878^^^^^^^^^^^^^^^^^^^^^^
8879
8880Syntax:
8881"""""""
8882
8883::
8884
8885      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
8886      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result
8887
8888Overview:
8889"""""""""
8890
8891The '``sdiv``' instruction returns the quotient of its two operands.
8892
8893Arguments:
8894""""""""""
8895
8896The two arguments to the '``sdiv``' instruction must be
8897:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8898arguments must have identical types.
8899
8900Semantics:
8901""""""""""
8902
8903The value produced is the signed integer quotient of the two operands
8904rounded towards zero.
8905
8906Note that signed integer division and unsigned integer division are
8907distinct operations; for unsigned integer division, use '``udiv``'.
8908
8909Division by zero is undefined behavior. For vectors, if any element
8910of the divisor is zero, the operation has undefined behavior.
8911Overflow also leads to undefined behavior; this is a rare case, but can
8912occur, for example, by doing a 32-bit division of -2147483648 by -1.
8913
8914If the ``exact`` keyword is present, the result value of the ``sdiv`` is
8915a :ref:`poison value <poisonvalues>` if the result would be rounded.
8916
8917Example:
8918""""""""
8919
8920.. code-block:: text
8921
8922      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var
8923
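Both sources of undefined behavior described above can be ruled out before
the division executes. The sketch below is one way to do so; the function name
and the fallback value of 0 are arbitrary choices made for illustration.

.. code-block:: llvm

      define i32 @safe_sdiv(i32 %n, i32 %d) {
      entry:
        ; Division by zero.
        %d.zero = icmp eq i32 %d, 0
        ; Overflow: -2147483648 divided by -1.
        %n.min = icmp eq i32 %n, -2147483648
        %d.m1 = icmp eq i32 %d, -1
        %ovf = and i1 %n.min, %d.m1
        %bad = or i1 %d.zero, %ovf
        br i1 %bad, label %fallback, label %divide

      fallback:
        ret i32 0

      divide:
        ; Only reached when the division is well defined.
        %q = sdiv i32 %n, %d
        ret i32 %q
      }
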
8924.. _i_fdiv:
8925
8926'``fdiv``' Instruction
8927^^^^^^^^^^^^^^^^^^^^^^
8928
8929Syntax:
8930"""""""
8931
8932::
8933
8934      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8935
8936Overview:
8937"""""""""
8938
8939The '``fdiv``' instruction returns the quotient of its two operands.
8940
8941Arguments:
8942""""""""""
8943
8944The two arguments to the '``fdiv``' instruction must be
8945:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8946floating-point values. Both arguments must have identical types.
8947
8948Semantics:
8949""""""""""
8950
8951The value produced is the floating-point quotient of the two operands.
8952This instruction is assumed to execute in the default :ref:`floating-point
8953environment <floatenv>`.
8954This instruction can also take any number of :ref:`fast-math
8955flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8957
8958Example:
8959""""""""
8960
8961.. code-block:: text
8962
8963      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var
8964
8965.. _i_urem:
8966
8967'``urem``' Instruction
8968^^^^^^^^^^^^^^^^^^^^^^
8969
8970Syntax:
8971"""""""
8972
8973::
8974
8975      <result> = urem <ty> <op1>, <op2>   ; yields ty:result
8976
8977Overview:
8978"""""""""
8979
8980The '``urem``' instruction returns the remainder from the unsigned
8981division of its two arguments.
8982
8983Arguments:
8984""""""""""
8985
8986The two arguments to the '``urem``' instruction must be
8987:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8988arguments must have identical types.
8989
8990Semantics:
8991""""""""""
8992
8993This instruction returns the unsigned integer *remainder* of a division.
8994This instruction always performs an unsigned division to get the
8995remainder.
8996
8997Note that unsigned integer remainder and signed integer remainder are
8998distinct operations; for signed integer remainder, use '``srem``'.
8999
9000Taking the remainder of a division by zero is undefined behavior.
9001For vectors, if any element of the divisor is zero, the operation has
9002undefined behavior.
9003
9004Example:
9005""""""""
9006
9007.. code-block:: text
9008
9009      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var
9010
9011.. _i_srem:
9012
9013'``srem``' Instruction
9014^^^^^^^^^^^^^^^^^^^^^^
9015
9016Syntax:
9017"""""""
9018
9019::
9020
9021      <result> = srem <ty> <op1>, <op2>   ; yields ty:result
9022
9023Overview:
9024"""""""""
9025
9026The '``srem``' instruction returns the remainder from the signed
9027division of its two operands. This instruction can also take
9028:ref:`vector <t_vector>` versions of the values in which case the elements
9029must be integers.
9030
9031Arguments:
9032""""""""""
9033
9034The two arguments to the '``srem``' instruction must be
9035:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9036arguments must have identical types.
9037
9038Semantics:
9039""""""""""
9040
9041This instruction returns the *remainder* of a division (where the result
9042is either zero or has the same sign as the dividend, ``op1``), not the
9043*modulo* operator (where the result is either zero or has the same sign
9044as the divisor, ``op2``) of a value. For more information about the
9045difference, see `The Math
9046Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
9047table of how this is implemented in various languages, please see
9048`Wikipedia: modulo
9049operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
9050
9051Note that signed integer remainder and unsigned integer remainder are
9052distinct operations; for unsigned integer remainder, use '``urem``'.
9053
9054Taking the remainder of a division by zero is undefined behavior.
9055For vectors, if any element of the divisor is zero, the operation has
9056undefined behavior.
9057Overflow also leads to undefined behavior; this is a rare case, but can
9058occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)
9062
9063Example:
9064""""""""
9065
9066.. code-block:: text
9067
9068      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var
9069
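To make the remainder/modulo distinction concrete, the sketch below derives a
modulo (result takes the sign of the divisor) from '``srem``'. The function
name is hypothetical, and the function inherits the undefined-behavior cases
of '``srem``' itself.

.. code-block:: llvm

      define i32 @floor_mod(i32 %a, i32 %b) {
        ; Remainder: zero, or the same sign as the dividend %a.
        %r = srem i32 %a, %b
        %r.nonzero = icmp ne i32 %r, 0
        ; The xor is negative exactly when %r and %b differ in sign.
        %x = xor i32 %r, %b
        %signs.differ = icmp slt i32 %x, 0
        %adjust = and i1 %r.nonzero, %signs.differ
        ; Adding the divisor moves the result to the divisor's sign.
        %r.adj = add i32 %r, %b
        %m = select i1 %adjust, i32 %r.adj, i32 %r
        ret i32 %m
      }

For ``%a = -7`` and ``%b = 3``, the '``srem``' yields -1 and the function
returns 2.
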
9070.. _i_frem:
9071
9072'``frem``' Instruction
9073^^^^^^^^^^^^^^^^^^^^^^
9074
9075Syntax:
9076"""""""
9077
9078::
9079
9080      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9081
9082Overview:
9083"""""""""
9084
9085The '``frem``' instruction returns the remainder from the division of
9086its two operands.
9087
9088Arguments:
9089""""""""""
9090
9091The two arguments to the '``frem``' instruction must be
9092:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9093floating-point values. Both arguments must have identical types.
9094
9095Semantics:
9096""""""""""
9097
9098The value produced is the floating-point remainder of the two operands.
9099This is the same output as a libm '``fmod``' function, but without any
9100possibility of setting ``errno``. The remainder has the same sign as the
9101dividend.
9102This instruction is assumed to execute in the default :ref:`floating-point
9103environment <floatenv>`.
9104This instruction can also take any number of :ref:`fast-math
9105flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9107
9108Example:
9109""""""""
9110
9111.. code-block:: text
9112
9113      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var
9114
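A small sketch with a hypothetical function name; the comment works through
one input pair to show that the result takes the sign of the dividend:

.. code-block:: llvm

      define float @frem_sketch(float %x, float %y) {
        ; %x = -5.5, %y = 2.0 yields -1.5.
        %r = frem float %x, %y
        ret float %r
      }
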
9115.. _bitwiseops:
9116
9117Bitwise Binary Operations
9118-------------------------
9119
9120Bitwise binary operators are used to do various forms of bit-twiddling
9121in a program. They are generally very efficient instructions and can
9122commonly be strength reduced from other instructions. They require two
9123operands of the same type, execute an operation on them, and produce a
9124single value. The resulting value is the same type as its operands.
9125
9126.. _i_shl:
9127
9128'``shl``' Instruction
9129^^^^^^^^^^^^^^^^^^^^^
9130
9131Syntax:
9132"""""""
9133
9134::
9135
9136      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
9137      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
9138      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
9139      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result
9140
9141Overview:
9142"""""""""
9143
9144The '``shl``' instruction returns the first operand shifted to the left
9145a specified number of bits.
9146
9147Arguments:
9148""""""""""
9149
9150Both arguments to the '``shl``' instruction must be the same
9151:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9152'``op2``' is treated as an unsigned value.
9153
9154Semantics:
9155""""""""""
9156
9157The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
9158where ``n`` is the width of the result. If ``op2`` is (statically or
9159dynamically) equal to or larger than the number of bits in
9160``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
9161If the arguments are vectors, each vector element of ``op1`` is shifted
9162by the corresponding shift amount in ``op2``.
9163
9164If the ``nuw`` keyword is present, then the shift produces a poison
9165value if it shifts out any non-zero bits.
9166If the ``nsw`` keyword is present, then the shift produces a poison
9167value if it shifts out any bits that disagree with the resultant sign bit.
9168
9169Example:
9170""""""""
9171
9172.. code-block:: text
9173
9174      <result> = shl i32 4, %var   ; yields i32: 4 << %var
9175      <result> = shl i32 4, 2      ; yields i32: 16
9176      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison value
9178      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
9179
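When the shift amount is not known to be smaller than the bit width, one
option is to mask it first. This is only a sketch with a hypothetical function
name; whether masking gives the desired result depends on the source language
being compiled.

.. code-block:: llvm

      define i32 @shl_masked(i32 %x, i32 %amt) {
        ; Keep the shift amount in the range 0..31 so that the shl
        ; never produces a poison value.
        %amt.masked = and i32 %amt, 31
        %r = shl i32 %x, %amt.masked
        ret i32 %r
      }
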
9180.. _i_lshr:
9181
9182
9183'``lshr``' Instruction
9184^^^^^^^^^^^^^^^^^^^^^^
9185
9186Syntax:
9187"""""""
9188
9189::
9190
9191      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
9192      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result
9193
9194Overview:
9195"""""""""
9196
9197The '``lshr``' instruction (logical shift right) returns the first
9198operand shifted to the right a specified number of bits with zero fill.
9199
9200Arguments:
9201""""""""""
9202
9203Both arguments to the '``lshr``' instruction must be the same
9204:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9205'``op2``' is treated as an unsigned value.
9206
9207Semantics:
9208""""""""""
9209
9210This instruction always performs a logical shift right operation. The
9211most significant bits of the result will be filled with zero bits after
9212the shift. If ``op2`` is (statically or dynamically) equal to or larger
9213than the number of bits in ``op1``, this instruction returns a :ref:`poison
9214value <poisonvalues>`. If the arguments are vectors, each vector element
9215of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9216
9217If the ``exact`` keyword is present, the result value of the ``lshr`` is
9218a poison value if any of the bits shifted out are non-zero.
9219
9220Example:
9221""""""""
9222
9223.. code-block:: text
9224
9225      <result> = lshr i32 4, 1   ; yields i32:result = 2
9226      <result> = lshr i32 4, 2   ; yields i32:result = 1
9227      <result> = lshr i8  4, 3   ; yields i8:result = 0
9228      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison value
9230      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
9231
9232.. _i_ashr:
9233
9234'``ashr``' Instruction
9235^^^^^^^^^^^^^^^^^^^^^^
9236
9237Syntax:
9238"""""""
9239
9240::
9241
9242      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
9243      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result
9244
9245Overview:
9246"""""""""
9247
9248The '``ashr``' instruction (arithmetic shift right) returns the first
9249operand shifted to the right a specified number of bits with sign
9250extension.
9251
9252Arguments:
9253""""""""""
9254
9255Both arguments to the '``ashr``' instruction must be the same
9256:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9257'``op2``' is treated as an unsigned value.
9258
9259Semantics:
9260""""""""""
9261
This instruction always performs an arithmetic shift right operation.
9263The most significant bits of the result will be filled with the sign bit
9264of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
9265than the number of bits in ``op1``, this instruction returns a :ref:`poison
9266value <poisonvalues>`. If the arguments are vectors, each vector element
9267of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9268
9269If the ``exact`` keyword is present, the result value of the ``ashr`` is
9270a poison value if any of the bits shifted out are non-zero.
9271
9272Example:
9273""""""""
9274
9275.. code-block:: text
9276
9277      <result> = ashr i32 4, 1   ; yields i32:result = 2
9278      <result> = ashr i32 4, 2   ; yields i32:result = 1
9279      <result> = ashr i8  4, 3   ; yields i8:result = 0
9280      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison value
9282      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
9283
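The difference between '``lshr``' and '``ashr``' shows up only for operands
whose sign bit is set. The following sketch, with hypothetical function names,
applies both shifts to the same ``i8`` input:

.. code-block:: llvm

      define i8 @shift_logical(i8 %x) {
        ; %x = -16 (0xF0) yields 0x3C (60): zero bits are shifted in.
        %r = lshr i8 %x, 2
        ret i8 %r
      }

      define i8 @shift_arithmetic(i8 %x) {
        ; %x = -16 (0xF0) yields 0xFC (-4): copies of the sign bit are shifted in.
        %r = ashr i8 %x, 2
        ret i8 %r
      }
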
9284.. _i_and:
9285
9286'``and``' Instruction
9287^^^^^^^^^^^^^^^^^^^^^
9288
9289Syntax:
9290"""""""
9291
9292::
9293
9294      <result> = and <ty> <op1>, <op2>   ; yields ty:result
9295
9296Overview:
9297"""""""""
9298
9299The '``and``' instruction returns the bitwise logical and of its two
9300operands.
9301
9302Arguments:
9303""""""""""
9304
9305The two arguments to the '``and``' instruction must be
9306:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9307arguments must have identical types.
9308
9309Semantics:
9310""""""""""
9311
9312The truth table used for the '``and``' instruction is:
9313
9314+-----+-----+-----+
9315| In0 | In1 | Out |
9316+-----+-----+-----+
9317|   0 |   0 |   0 |
9318+-----+-----+-----+
9319|   0 |   1 |   0 |
9320+-----+-----+-----+
9321|   1 |   0 |   0 |
9322+-----+-----+-----+
9323|   1 |   1 |   1 |
9324+-----+-----+-----+
9325
9326Example:
9327""""""""
9328
9329.. code-block:: text
9330
9331      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
9332      <result> = and i32 15, 40          ; yields i32:result = 8
9333      <result> = and i32 4, 8            ; yields i32:result = 0
9334
9335.. _i_or:
9336
9337'``or``' Instruction
9338^^^^^^^^^^^^^^^^^^^^
9339
9340Syntax:
9341"""""""
9342
9343::
9344
9345      <result> = or <ty> <op1>, <op2>   ; yields ty:result
9346
9347Overview:
9348"""""""""
9349
9350The '``or``' instruction returns the bitwise logical inclusive or of its
9351two operands.
9352
9353Arguments:
9354""""""""""
9355
9356The two arguments to the '``or``' instruction must be
9357:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9358arguments must have identical types.
9359
9360Semantics:
9361""""""""""
9362
9363The truth table used for the '``or``' instruction is:
9364
9365+-----+-----+-----+
9366| In0 | In1 | Out |
9367+-----+-----+-----+
9368|   0 |   0 |   0 |
9369+-----+-----+-----+
9370|   0 |   1 |   1 |
9371+-----+-----+-----+
9372|   1 |   0 |   1 |
9373+-----+-----+-----+
9374|   1 |   1 |   1 |
9375+-----+-----+-----+
9376
9377Example:
9378""""""""
9379
.. code-block:: text
9381
9382      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
9383      <result> = or i32 15, 40          ; yields i32:result = 47
9384      <result> = or i32 4, 8            ; yields i32:result = 12
9385
9386.. _i_xor:
9387
9388'``xor``' Instruction
9389^^^^^^^^^^^^^^^^^^^^^
9390
9391Syntax:
9392"""""""
9393
9394::
9395
9396      <result> = xor <ty> <op1>, <op2>   ; yields ty:result
9397
9398Overview:
9399"""""""""
9400
9401The '``xor``' instruction returns the bitwise logical exclusive or of
9402its two operands. The ``xor`` is used to implement the "one's
9403complement" operation, which is the "~" operator in C.
9404
9405Arguments:
9406""""""""""
9407
9408The two arguments to the '``xor``' instruction must be
9409:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9410arguments must have identical types.
9411
9412Semantics:
9413""""""""""
9414
9415The truth table used for the '``xor``' instruction is:
9416
9417+-----+-----+-----+
9418| In0 | In1 | Out |
9419+-----+-----+-----+
9420|   0 |   0 |   0 |
9421+-----+-----+-----+
9422|   0 |   1 |   1 |
9423+-----+-----+-----+
9424|   1 |   0 |   1 |
9425+-----+-----+-----+
9426|   1 |   1 |   0 |
9427+-----+-----+-----+
9428
9429Example:
9430""""""""
9431
9432.. code-block:: text
9433
9434      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
9435      <result> = xor i32 15, 40          ; yields i32:result = 39
9436      <result> = xor i32 4, 8            ; yields i32:result = 12
9437      <result> = xor i32 %V, -1          ; yields i32:result = ~%V
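
The same complement idiom applies to ``i1`` values and to vectors. The
following sketch shows both forms; the function names are hypothetical and
chosen only for illustration:

.. code-block:: llvm

      ; Hypothetical helpers, for illustration only.
      define i1 @logical_not(i1 %b) {
        ; For i1, xor with true flips the single bit.
        %n = xor i1 %b, true
        ret i1 %n
      }

      define <2 x i8> @not_lanes(<2 x i8> %v) {
        ; An all-ones constant in every lane inverts every bit of every element.
        %n = xor <2 x i8> %v, <i8 -1, i8 -1>
        ret <2 x i8> %n
      }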
9438
9439Vector Operations
9440-----------------
9441
9442LLVM supports several instructions to represent vector operations in a
9443target-independent manner. These instructions cover the element-access
9444and vector-specific operations needed to process vectors effectively.
9445While LLVM does directly support these vector operations, many
9446sophisticated algorithms will want to use target-specific intrinsics to
9447take full advantage of a specific target.
9448
9449.. _i_extractelement:
9450
9451'``extractelement``' Instruction
9452^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9453
9454Syntax:
9455"""""""
9456
9457::
9458
9459      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
9460      <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
9461
9462Overview:
9463"""""""""
9464
9465The '``extractelement``' instruction extracts a single scalar element
9466from a vector at a specified index.
9467
9468Arguments:
9469""""""""""
9470
9471The first operand of an '``extractelement``' instruction is a value of
9472:ref:`vector <t_vector>` type. The second operand is an index indicating
9473the position from which to extract the element. The index may be a
9474variable of any integer type.
9475
9476Semantics:
9477""""""""""
9478
The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx`` is out of
range, that is, greater than or equal to the length of ``val`` for a
fixed-length vector or to the runtime length for a scalable vector, the
result is a :ref:`poison value <poisonvalues>`.
9485
9486Example:
9487""""""""
9488
9489.. code-block:: text
9490
9491      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
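
A slightly larger sketch, using hypothetical function names, shows constant
indices of different integer types and a variable index:

.. code-block:: llvm

      ; Hypothetical helpers, for illustration only.
      define i32 @sum_first_two(<4 x i32> %vec) {
        %e0 = extractelement <4 x i32> %vec, i32 0
        %e1 = extractelement <4 x i32> %vec, i64 1   ; the index may be of any integer type
        %sum = add i32 %e0, %e1
        ret i32 %sum
      }

      define i32 @lane(<4 x i32> %vec, i8 %idx) {
        ; The result is a poison value if %idx does not name one of the 4 lanes.
        %e = extractelement <4 x i32> %vec, i8 %idx
        ret i32 %e
      }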
9492
9493.. _i_insertelement:
9494
9495'``insertelement``' Instruction
9496^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9497
9498Syntax:
9499"""""""
9500
9501::
9502
9503      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
9504      <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
9505
9506Overview:
9507"""""""""
9508
9509The '``insertelement``' instruction inserts a scalar element into a
9510vector at a specified index.
9511
9512Arguments:
9513""""""""""
9514
9515The first operand of an '``insertelement``' instruction is a value of
9516:ref:`vector <t_vector>` type. The second operand is a scalar value whose
9517type must equal the element type of the first operand. The third operand
9518is an index indicating the position at which to insert the value. The
9519index may be a variable of any integer type.
9520
9521Semantics:
9522""""""""""
9523
The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` is out of range, that is, greater than or equal to the
length of ``val`` for a fixed-length vector or to the runtime length for a
scalable vector, the result is a :ref:`poison value <poisonvalues>`.
9530
9531Example:
9532""""""""
9533
9534.. code-block:: text
9535
9536      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
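
Because each ``insertelement`` yields a new vector value, a vector is
typically built from scalars by chaining insertions. A minimal sketch, with a
hypothetical function name:

.. code-block:: llvm

      ; Hypothetical helper, for illustration only.
      define <2 x float> @make_pair(float %a, float %b) {
        %v0 = insertelement <2 x float> undef, float %a, i32 0   ; <%a, undef>
        %v1 = insertelement <2 x float> %v0, float %b, i32 1     ; <%a, %b>
        ret <2 x float> %v1
      }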
9537
9538.. _i_shufflevector:
9539
9540'``shufflevector``' Instruction
9541^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9542
9543Syntax:
9544"""""""
9545
9546::
9547
9548      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>
9550
9551Overview:
9552"""""""""
9553
The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the inputs and the same length as the shuffle mask.
9557
9558Arguments:
9559""""""""""
9560
9561The first two operands of a '``shufflevector``' instruction are vectors
9562with the same type. The third argument is a shuffle mask vector constant
9563whose element type is ``i32``. The mask vector elements must be constant
9564integers or ``undef`` values. The result of the instruction is a vector
9565whose length is the same as the shuffle mask and whose element type is the
9566same as the element type of the first two operands.
9567
9568Semantics:
9569""""""""""
9570
9571The elements of the two input vectors are numbered from left to right
9572across both of the vectors. For each element of the result vector, the
9573shuffle mask selects an element from one of the input vectors to copy
9574to the result. Non-negative elements in the mask represent an index
9575into the concatenated pair of input vectors.
9576
9577If the shuffle mask is undefined, the result vector is undefined. If
9578the shuffle mask selects an undefined element from one of the input
9579vectors, the resulting element is undefined. An undefined element
9580in the mask vector specifies that the resulting element is undefined.
9581An undefined element in the mask vector prevents a poisoned vector
9582element from propagating.
9583
9584For scalable vectors, the only valid mask values at present are
9585``zeroinitializer`` and ``undef``, since we cannot write all indices as
9586literals for a vector with a length unknown at compile time.
9587
9588Example:
9589""""""""
9590
9591.. code-block:: text
9592
9593      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9594                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
9595      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
9596                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
9597      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
9598                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
9599      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9600                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
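
Two common uses, sketched here with hypothetical function names, are
reversing a vector and broadcasting ("splatting") one scalar to every lane.
The scalable variant relies on the ``zeroinitializer`` mask described above,
which selects element 0 of the first input for every result lane.

.. code-block:: llvm

      ; Hypothetical helpers, for illustration only.
      define <4 x i32> @reverse(<4 x i32> %v) {
        %rev = shufflevector <4 x i32> %v, <4 x i32> undef,
                             <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %rev
      }

      define <vscale x 4 x i32> @splat(i32 %x) {
        ; Place %x in lane 0, then copy lane 0 into every lane of the result.
        %ins = insertelement <vscale x 4 x i32> undef, i32 %x, i32 0
        %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> undef,
                               <vscale x 4 x i32> zeroinitializer
        ret <vscale x 4 x i32> %splat
      }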
9601
9602Aggregate Operations
9603--------------------
9604
9605LLVM supports several instructions for working with
9606:ref:`aggregate <t_aggregate>` values.
9607
9608.. _i_extractvalue:
9609
9610'``extractvalue``' Instruction
9611^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9612
9613Syntax:
9614"""""""
9615
9616::
9617
9618      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
9619
9620Overview:
9621"""""""""
9622
9623The '``extractvalue``' instruction extracts the value of a member field
9624from an :ref:`aggregate <t_aggregate>` value.
9625
9626Arguments:
9627""""""""""
9628
9629The first operand of an '``extractvalue``' instruction is a value of
9630:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
9631constant indices to specify which value to extract in a similar manner
9632as indices in a '``getelementptr``' instruction.
9633
9634The major differences to ``getelementptr`` indexing are:
9635
9636-  Since the value being indexed is not a pointer, the first index is
9637   omitted and assumed to be zero.
9638-  At least one index must be specified.
9639-  Not only struct indices but also array indices must be in bounds.
9640
9641Semantics:
9642""""""""""
9643
9644The result is the value at the position in the aggregate specified by
9645the index operands.
9646
9647Example:
9648""""""""
9649
9650.. code-block:: text
9651
9652      <result> = extractvalue {i32, float} %agg, 0    ; yields i32
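
With nested aggregates, several indices are given, one per nesting level. A
minimal sketch with a hypothetical function name:

.. code-block:: llvm

      ; Hypothetical helper, for illustration only.
      define float @first_of_array({i32, [2 x float]} %agg) {
        ; Index 1 selects the [2 x float] member; index 0 selects its first element.
        %f = extractvalue {i32, [2 x float]} %agg, 1, 0
        ret float %f
      }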
9653
9654.. _i_insertvalue:
9655
9656'``insertvalue``' Instruction
9657^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9658
9659Syntax:
9660"""""""
9661
9662::
9663
9664      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>
9665
9666Overview:
9667"""""""""
9668
9669The '``insertvalue``' instruction inserts a value into a member field in
9670an :ref:`aggregate <t_aggregate>` value.
9671
9672Arguments:
9673""""""""""
9674
9675The first operand of an '``insertvalue``' instruction is a value of
9676:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
9677a first-class value to insert. The following operands are constant
9678indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
9680to insert must have the same type as the value identified by the
9681indices.
9682
9683Semantics:
9684""""""""""
9685
9686The result is an aggregate of the same type as ``val``. Its value is
9687that of ``val`` except that the value at the position specified by the
9688indices is that of ``elt``.
9689
9690Example:
9691""""""""
9692
9693.. code-block:: llvm
9694
9695      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
9696      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
9697      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}
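
One way these instructions might be combined is to return two results from a
function as a single struct value. The sketch below uses a hypothetical
function name and, as an assumption of the example, reports overflow only for
the largest signed ``i32`` input:

.. code-block:: llvm

      ; Hypothetical helper, for illustration only.
      define {i32, i1} @checked_inc(i32 %x) {
        %sum = add i32 %x, 1                         ; wraps on overflow
        %overflow = icmp eq i32 %x, 2147483647       ; true only for the largest signed i32
        %r0 = insertvalue {i32, i1} undef, i32 %sum, 0
        %r1 = insertvalue {i32, i1} %r0, i1 %overflow, 1
        ret {i32, i1} %r1
      }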
9698
9699.. _memoryops:
9700
9701Memory Access and Addressing Operations
9702---------------------------------------
9703
9704A key design point of an SSA-based representation is how it represents
9705memory. In LLVM, no memory locations are in SSA form, which makes things
9706very simple. This section describes how to read, write, and allocate
9707memory in LLVM.
9708
9709.. _i_alloca:
9710
9711'``alloca``' Instruction
9712^^^^^^^^^^^^^^^^^^^^^^^^
9713
9714Syntax:
9715"""""""
9716
9717::
9718
9719      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result
9720
9721Overview:
9722"""""""""
9723
9724The '``alloca``' instruction allocates memory on the stack frame of the
9725currently executing function, to be automatically released when this
9726function returns to its caller.  If the address space is not explicitly
9727specified, the object is allocated in the alloca address space from the
9728:ref:`datalayout string<langref_datalayout>`.
9729
9730Arguments:
9731""""""""""
9732
9733The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
9734bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
9739alignment may not be greater than ``1 << 32``. If not specified, or if
9740zero, the target can choose to align the allocation on any convenient
9741boundary compatible with the type.
9742
9743'``type``' may be any sized type.
9744
9745Semantics:
9746""""""""""
9747
Memory is allocated; a pointer is returned. The allocated memory is
uninitialized, and loading from uninitialized memory produces an undefined
value. The operation itself is undefined if there is insufficient stack
space for the allocation. The '``alloca``' instruction is commonly used
to represent automatic variables that must have an address available. When
the function returns (either with the ``ret`` or ``resume`` instructions),
the memory is reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
which way the stack grows) is not specified.
9758
9759Note that '``alloca``' outside of the alloca address space from the
9760:ref:`datalayout string<langref_datalayout>` is meaningful only if the
9761target has assigned it a semantics.
9762
9763If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
9764the returned object is initially dead.
9765See :ref:`llvm.lifetime.start <int_lifestart>` and
9766:ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
9767lifetime-manipulating intrinsics.
9768
9769Example:
9770""""""""
9771
9772.. code-block:: llvm
9773
9774      %ptr = alloca i32                             ; yields i32*:ptr
9775      %ptr = alloca i32, i32 4                      ; yields i32*:ptr
9776      %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
9777      %ptr = alloca i32, align 1024                 ; yields i32*:ptr
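
In a complete function, an array allocation is usually combined with
``getelementptr``, ``store``, and ``load``. The following is a minimal
sketch with a hypothetical function name, written in the typed-pointer
syntax used by the examples in this document:

.. code-block:: llvm

      ; Hypothetical helper, for illustration only.
      define i32 @sum_two(i32 %a, i32 %b) {
      entry:
        %buf = alloca i32, i32 2, align 4                   ; room for two i32 values
        %slot1 = getelementptr inbounds i32, i32* %buf, i64 1
        store i32 %a, i32* %buf, align 4                    ; element 0
        store i32 %b, i32* %slot1, align 4                  ; element 1
        %x = load i32, i32* %buf, align 4
        %y = load i32, i32* %slot1, align 4
        %sum = add i32 %x, %y
        ret i32 %sum                                        ; %buf is released here
      }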
9778
9779.. _i_load:
9780
9781'``load``' Instruction
9782^^^^^^^^^^^^^^^^^^^^^^
9783
9784Syntax:
9785"""""""
9786
9787::
9788
9789      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
9790      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
9791      !<nontemp_node> = !{ i32 1 }
9792      !<empty_node> = !{}
9793      !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
9794      !<align_node> = !{ i64 <value_alignment> }
9795
9796Overview:
9797"""""""""
9798
9799The '``load``' instruction is used to read from memory.
9800
9801Arguments:
9802""""""""""
9803
9804The argument to the ``load`` instruction specifies the memory address from which
9805to load. The type specified must be a :ref:`first class <t_firstclass>` type of
9806known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
9807the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
9808modify the number or order of execution of this ``load`` with other
9809:ref:`volatile operations <volatile>`.
9810
9811If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
9812<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9813``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
9814Atomic loads produce :ref:`defined <memmodel>` results when they may see
9815multiple atomic stores. The type of the pointee must be an integer, pointer, or
9816floating-point type whose bit width is a power of two greater than or equal to
9817eight and less than or equal to a target-specific size limit.  ``align`` must be
9818explicitly specified on atomic loads, and the load has undefined behavior if the
9819alignment is not set to a value which is at least the size in bytes of the
9820pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
9821
9822The optional constant ``align`` argument specifies the alignment of the
9823operation (that is, the alignment of the memory address). A value of 0
9824or an omitted ``align`` argument means that the operation has the ABI
9825alignment for the target. It is the responsibility of the code emitter
9826to ensure that the alignment information is correct. Overestimating the
9827alignment results in undefined behavior. Underestimating the alignment
9828may produce less efficient code. An alignment of 1 is always safe. The
9829maximum possible alignment is ``1 << 32``. An alignment value higher
9830than the size of the loaded type implies memory up to the alignment
9831value bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.
9835
9836The optional ``!nontemporal`` metadata must reference a single
9837metadata name ``<nontemp_node>`` corresponding to a metadata node with one
9838``i32`` entry of value 1. The existence of the ``!nontemporal``
9839metadata on the instruction tells the optimizer and code generator
9840that this load is not expected to be reused in the cache. The code
9841generator may select special instructions to save cache bandwidth, such
9842as the ``MOVNT`` instruction on x86.
9843
9844The optional ``!invariant.load`` metadata must reference a single
9845metadata name ``<empty_node>`` corresponding to a metadata node with no
9846entries. If a load instruction tagged with the ``!invariant.load``
9847metadata is executed, the memory location referenced by the load has
9848to contain the same value at all points in the program where the
9849memory location is dereferenceable; otherwise, the behavior is
9850undefined.
9851
The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.
9855
9856The optional ``!nonnull`` metadata must reference a single
9857metadata name ``<empty_node>`` corresponding to a metadata node with no
9858entries. The existence of the ``!nonnull`` metadata on the
9859instruction tells the optimizer that the value loaded is known to
9860never be null. If the value is null at runtime, the behavior is undefined.
9861This is analogous to the ``nonnull`` attribute on parameters and return
9862values. This metadata can only be applied to loads of a pointer type.
9863
9864The optional ``!dereferenceable`` metadata must reference a single metadata
9865name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
9866entry.
9867See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.
9868
9869The optional ``!dereferenceable_or_null`` metadata must reference a single
9870metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
9871``i64`` entry.
9872See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
9873<md_dereferenceable_or_null>`.
9874
9875The optional ``!align`` metadata must reference a single metadata name
9876``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
9877The existence of the ``!align`` metadata on the instruction tells the
9878optimizer that the value loaded is known to be aligned to a boundary specified
9879by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
9881This metadata can only be applied to loads of a pointer type. If the returned
9882value is not appropriately aligned at runtime, the behavior is undefined.
9883
9884The optional ``!noundef`` metadata must reference a single metadata name
9885``<empty_node>`` corresponding to a node with no entries. The existence of
9886``!noundef`` metadata on the instruction tells the optimizer that the value
9887loaded is known to be :ref:`well defined <welldefinedvalues>`.
9888If the value isn't well defined, the behavior is undefined.
9889
9890Semantics:
9891""""""""""
9892
9893The location of memory pointed to is loaded. If the value being loaded
9894is of scalar type then the number of bytes read does not exceed the
9895minimum number of bytes needed to hold all bits of the type. For
9896example, loading an ``i24`` reads at most three bytes. When loading a
9897value of a type like ``i20`` with a size that is not an integral number
9898of bytes, the result is undefined if the value was not originally
9899written using a store of the same type.
9900If the value being loaded is of aggregate type, the bytes that correspond to
9901padding may be accessed but are ignored, because it is impossible to observe
9902padding from the loaded aggregate value.
9903If ``<pointer>`` is not a well-defined value, the behavior is undefined.
9904
9905Examples:
9906"""""""""
9907
9908.. code-block:: llvm
9909
9910      %ptr = alloca i32                               ; yields i32*:ptr
9911      store i32 3, i32* %ptr                          ; yields void
9912      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
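
The optional metadata and the atomic form might be used as in the following
sketch. The function names, the metadata numbering, and the chosen alignment
values are assumptions made for illustration:

.. code-block:: llvm

      ; Hypothetical helpers, for illustration only.
      define i32 @read_through(i32** %pp) {
        ; The loaded pointer is asserted to be non-null and 4-byte aligned;
        ; the behavior is undefined if either assertion is false at runtime.
        %p = load i32*, i32** %pp, align 8, !nonnull !0, !align !1
        %v = load i32, i32* %p, align 4
        ret i32 %v
      }

      define i32 @read_flag(i32* %flag) {
        ; Atomic loads require an ordering and an explicit alignment.
        %v = load atomic i32, i32* %flag acquire, align 4
        ret i32 %v
      }

      !0 = !{}
      !1 = !{ i64 4 }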
9913
9914.. _i_store:
9915
9916'``store``' Instruction
9917^^^^^^^^^^^^^^^^^^^^^^^
9918
9919Syntax:
9920"""""""
9921
9922::
9923
9924      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
9925      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
9926      !<nontemp_node> = !{ i32 1 }
9927      !<empty_node> = !{}
9928
9929Overview:
9930"""""""""
9931
9932The '``store``' instruction is used to write to memory.
9933
9934Arguments:
9935""""""""""
9936
9937There are two arguments to the ``store`` instruction: a value to store and an
9938address at which to store it. The type of the ``<pointer>`` operand must be a
9939pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
9940operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
9941allowed to modify the number or order of execution of this ``store`` with other
9942:ref:`volatile operations <volatile>`.  Only values of :ref:`first class
9943<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
9944structural type <t_opaque>`) can be stored.
9945
9946If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
9947<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9948``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
9949Atomic loads produce :ref:`defined <memmodel>` results when they may see
9950multiple atomic stores. The type of the pointee must be an integer, pointer, or
9951floating-point type whose bit width is a power of two greater than or equal to
9952eight and less than or equal to a target-specific size limit.  ``align`` must be
9953explicitly specified on atomic stores, and the store has undefined behavior if
9954the alignment is not set to a value which is at least the size in bytes of the
9955pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
9956
9957The optional constant ``align`` argument specifies the alignment of the
9958operation (that is, the alignment of the memory address). A value of 0
9959or an omitted ``align`` argument means that the operation has the ABI
9960alignment for the target. It is the responsibility of the code emitter
9961to ensure that the alignment information is correct. Overestimating the
9962alignment results in undefined behavior. Underestimating the
9963alignment may produce less efficient code. An alignment of 1 is always
9964safe. The maximum possible alignment is ``1 << 32``. An alignment
9965value higher than the size of the stored type implies memory up to the
9966alignment value bytes can be stored to without trapping in the default
9967address space. Storing to the higher bytes however may result in data
9968races if another thread can access the same address. Introducing a
9969data race is not allowed. Storing to the extra bytes is not allowed
9970even in situations where a data race is known to not exist if the
9971function has the ``sanitize_address`` attribute.
9972
9973The optional ``!nontemporal`` metadata must reference a single metadata
9974name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
9975of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
9977be reused in the cache. The code generator may select special
9978instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
9979x86.
9980
The optional ``!invariant.group`` metadata must reference a single metadata
name ``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.
9983
9984Semantics:
9985""""""""""
9986
9987The contents of memory are updated to contain ``<value>`` at the
9988location specified by the ``<pointer>`` operand. If ``<value>`` is
9989of scalar type then the number of bytes written does not exceed the
9990minimum number of bytes needed to hold all bits of the type. For
9991example, storing an ``i24`` writes at most three bytes. When writing a
9992value of a type like ``i20`` with a size that is not an integral number
9993of bytes, it is unspecified what happens to the extra bits that do not
9994belong to the type, but they will typically be overwritten.
9995If ``<value>`` is of aggregate type, padding is filled with
9996:ref:`undef <undefvalues>`.
9997If ``<pointer>`` is not a well-defined value, the behavior is undefined.
9998
9999Example:
10000""""""""
10001
10002.. code-block:: llvm
10003
10004      %ptr = alloca i32                               ; yields i32*:ptr
10005      store i32 3, i32* %ptr                          ; yields void
10006      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
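
A sketch of the atomic and ``!nontemporal`` forms follows. The function
names and the metadata numbering are hypothetical:

.. code-block:: llvm

      ; Hypothetical helpers, for illustration only.
      define void @publish(i32* %data, i32* %flag) {
        store i32 42, i32* %data, align 4
        ; Atomic stores require an ordering and an explicit alignment.
        store atomic i32 1, i32* %flag release, align 4
        ret void
      }

      define void @stream(<4 x float> %v, <4 x float>* %dst) {
        ; Hint that the stored data is not expected to be reused in the cache.
        store <4 x float> %v, <4 x float>* %dst, align 16, !nontemporal !0
        ret void
      }

      !0 = !{ i32 1 }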
10007
10008.. _i_fence:
10009
10010'``fence``' Instruction
10011^^^^^^^^^^^^^^^^^^^^^^^
10012
10013Syntax:
10014"""""""
10015
10016::
10017
10018      fence [syncscope("<target-scope>")] <ordering>  ; yields void
10019
10020Overview:
10021"""""""""
10022
10023The '``fence``' instruction is used to introduce happens-before edges
10024between operations.
10025
10026Arguments:
10027""""""""""
10028
10029'``fence``' instructions take an :ref:`ordering <ordering>` argument which
10030defines what *synchronizes-with* edges they add. They can only be given
10031``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
10032
10033Semantics:
10034""""""""""
10035
10036A fence A which has (at least) ``release`` ordering semantics
10037*synchronizes with* a fence B with (at least) ``acquire`` ordering
10038semantics if and only if there exist atomic operations X and Y, both
10039operating on some atomic object M, such that A is sequenced before X, X
10040modifies M (either directly or through some side effect of a sequence
10041headed by X), Y is sequenced before B, and Y observes M. This provides a
10042*happens-before* dependency between A and B. Rather than an explicit
10043``fence``, one (but not both) of the atomic operations X or Y might
10044provide a ``release`` or ``acquire`` (resp.) ordering constraint and
10045still *synchronize-with* the explicit ``fence`` and establish the
10046*happens-before* edge.
10047
10048A ``fence`` which has ``seq_cst`` ordering, in addition to having both
10049``acquire`` and ``release`` semantics specified above, participates in
10050the global program order of other ``seq_cst`` operations and/or fences.
10051
10052A ``fence`` instruction can also take an optional
10053":ref:`syncscope <syncscope>`" argument.
10054
10055Example:
10056""""""""
10057
10058.. code-block:: text
10059
10060      fence acquire                                        ; yields void
10061      fence syncscope("singlethread") seq_cst              ; yields void
10062      fence syncscope("agent") seq_cst                     ; yields void
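
The *synchronizes-with* rule above can be sketched with a producer and a
consumer that communicate through a flag. The function names are
hypothetical, and the comments map the instructions onto the fences A and B
and the atomic operations X and Y from the description above:

.. code-block:: llvm

      ; Hypothetical helpers, for illustration only.
      define void @producer(i32* %data, i32* %flag) {
        store i32 42, i32* %data, align 4
        fence release                                         ; fence A
        store atomic i32 1, i32* %flag monotonic, align 4     ; X, modifies the flag
        ret void
      }

      define i32 @consumer(i32* %data, i32* %flag) {
      entry:
        br label %spin

      spin:
        %f = load atomic i32, i32* %flag monotonic, align 4   ; Y, observes the flag
        %ready = icmp eq i32 %f, 1
        br i1 %ready, label %done, label %spin

      done:
        fence acquire                                         ; fence B
        %v = load i32, i32* %data, align 4                    ; ordered after the store to %data
        ret i32 %v
      }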
10063
10064.. _i_cmpxchg:
10065
10066'``cmpxchg``' Instruction
10067^^^^^^^^^^^^^^^^^^^^^^^^^
10068
10069Syntax:
10070"""""""
10071
10072::
10073
10074      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }
10075
10076Overview:
10077"""""""""
10078
10079The '``cmpxchg``' instruction is used to atomically modify memory. It
10080loads a value in memory and compares it to a given value. If they are
10081equal, it tries to store a new value into the memory.
10082
10083Arguments:
10084""""""""""
10085
10086There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
10088address, and a new value to place at that address if the compared values
10089are equal. The type of '<cmp>' must be an integer or pointer type whose
10090bit width is a power of two greater than or equal to eight and less
10091than or equal to a target-specific size limit. '<cmp>' and '<new>' must
10092have the same type, and the type of '<pointer>' must be a pointer to
10093that type. If the ``cmpxchg`` is marked as ``volatile``, then the
10094optimizer is not allowed to modify the number or order of execution of
10095this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
10096
10097The success and failure :ref:`ordering <ordering>` arguments specify how this
10098``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, and the failure ordering cannot be either
10100``release`` or ``acq_rel``.
10101
10102A ``cmpxchg`` instruction can also take an optional
10103":ref:`syncscope <syncscope>`" argument.
10104
The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<cmp>' type. If unspecified, the alignment is assumed to be equal to the
size of the '<cmp>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when ``align``
isn't specified.
10111
10112The pointer passed into cmpxchg must have alignment greater than or
10113equal to the size in memory of the operand.
10114
10115Semantics:
10116""""""""""
10117
10118The contents of memory at the location specified by the '``<pointer>``' operand
10119is read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
10120written to the location. The original value at the location is returned,
10121together with a flag indicating success (true) or failure (false).
10122
If the ``cmpxchg`` operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the ``cmpxchg`` operation is strong (the default), the ``i1`` value is 1 if
and only if the value loaded equals ``<cmp>``.
10129
10130A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
10131identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
10133
10134Example:
10135""""""""
10136
10137.. code-block:: llvm
10138
10139    entry:
10140      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
10141      br label %loop
10142
10143    loop:
10144      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
10145      %squared = mul i32 %cmp, %cmp
10146      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
10147      %value_loaded = extractvalue { i32, i1 } %val_success, 0
10148      %success = extractvalue { i32, i1 } %val_success, 1
10149      br i1 %success, label %done, label %loop
10150
10151    done:
10152      ...
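
For readers more familiar with C, the loop above corresponds roughly to the
following C11 sketch. The function name ``atomic_square`` is hypothetical, and
``memory_order_relaxed`` stands in for both ``unordered`` and ``monotonic``
because C11 has no weaker ordering. It illustrates the retry pattern only and
is not the code any particular frontend emits.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdint.h>

    /* Hypothetical helper: atomically replace *ptr with its square and
       return the value that was squared. */
    uint32_t atomic_square(_Atomic uint32_t *ptr) {
      /* Initial guess; relaxed is the closest C11 match for 'unordered'. */
      uint32_t expected = atomic_load_explicit(ptr, memory_order_relaxed);
      for (;;) {
        uint32_t squared = expected * expected; /* wraps like 'mul i32' */
        /* Strong compare-exchange: acq_rel on success, relaxed on failure. */
        if (atomic_compare_exchange_strong_explicit(
                ptr, &expected, squared,
                memory_order_acq_rel, memory_order_relaxed))
          return expected;
        /* On failure, 'expected' now holds the value that was loaded. */
      }
    }

As in the IR, the value returned by a failed attempt feeds the next iteration,
so no separate reload is needed.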
10153
10154.. _i_atomicrmw:
10155
10156'``atomicrmw``' Instruction
10157^^^^^^^^^^^^^^^^^^^^^^^^^^^
10158
10159Syntax:
10160"""""""
10161
10162::
10163
10164      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty
10165
10166Overview:
10167"""""""""
10168
10169The '``atomicrmw``' instruction is used to atomically modify memory.
10170
10171Arguments:
10172""""""""""
10173
10174There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:
10177
10178-  xchg
10179-  add
10180-  sub
10181-  and
10182-  nand
10183-  or
10184-  xor
10185-  max
10186-  min
10187-  umax
10188-  umin
10189-  fadd
10190-  fsub
10191
10192For most of these operations, the type of '<value>' must be an integer
10193type whose bit width is a power of two greater than or equal to eight
10194and less than or equal to a target-specific size limit. For xchg, this
10195may also be a floating point type with the same size constraints as
10196integers.  For fadd/fsub, this must be a floating point type.  The
10197type of the '``<pointer>``' operand must be a pointer to that type. If
10198the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
10199allowed to modify the number or order of execution of this
10200``atomicrmw`` with other :ref:`volatile operations <volatile>`.
10201
The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<value>' type. If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when ``align``
isn't specified.
10208
An ``atomicrmw`` instruction can also take an optional
10210":ref:`syncscope <syncscope>`" argument.
10211
10212Semantics:
10213""""""""""
10214
10215The contents of memory at the location specified by the '``<pointer>``'
10216operand are atomically read, modified, and written back. The original
10217value at the location is returned. The modification is specified by the
10218operation argument:
10219
10220-  xchg: ``*ptr = val``
10221-  add: ``*ptr = *ptr + val``
10222-  sub: ``*ptr = *ptr - val``
10223-  and: ``*ptr = *ptr & val``
10224-  nand: ``*ptr = ~(*ptr & val)``
10225-  or: ``*ptr = *ptr | val``
10226-  xor: ``*ptr = *ptr ^ val``
10227-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
10228-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
10229-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
10230-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
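
As a rough C11 analogy (an illustration, not a definition of the IR
semantics), ``add`` maps onto ``atomic_fetch_add_explicit``, while operations
without a C11 counterpart, such as ``nand`` and ``max``, can be modelled with a
compare-exchange loop. The helper names below are hypothetical, and the
``acquire`` ordering is chosen only for the sketch.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdint.h>

    /* add: *ptr = *ptr + val; returns the original value. */
    int32_t rmw_add(_Atomic int32_t *ptr, int32_t val) {
      return atomic_fetch_add_explicit(ptr, val, memory_order_acquire);
    }

    /* nand: *ptr = ~(*ptr & val); C11 has no fetch_nand, so retry with CAS. */
    int32_t rmw_nand(_Atomic int32_t *ptr, int32_t val) {
      int32_t old = atomic_load_explicit(ptr, memory_order_relaxed);
      while (!atomic_compare_exchange_weak_explicit(
          ptr, &old, ~(old & val),
          memory_order_acquire, memory_order_relaxed)) {
        /* 'old' was refreshed with the current contents; try again. */
      }
      return old;
    }

    /* max: *ptr = *ptr > val ? *ptr : val, using a signed comparison. */
    int32_t rmw_max(_Atomic int32_t *ptr, int32_t val) {
      int32_t old = atomic_load_explicit(ptr, memory_order_relaxed);
      while (!atomic_compare_exchange_weak_explicit(
          ptr, &old, old > val ? old : val,
          memory_order_acquire, memory_order_relaxed)) {
        /* Retry with the refreshed 'old'. */
      }
      return old;
    }

Every helper returns the value that was in memory before the update, matching
the result of ``atomicrmw``.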
10233
10234Example:
10235""""""""
10236
10237.. code-block:: llvm
10238
10239      %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32
10240
10241.. _i_getelementptr:
10242
10243'``getelementptr``' Instruction
10244^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10245
10246Syntax:
10247"""""""
10248
10249::
10250
10251      <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10252      <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10253      <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>
10254
10255Overview:
10256"""""""""
10257
10258The '``getelementptr``' instruction is used to get the address of a
10259subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
10260address calculation only and does not access memory. The instruction can also
10261be used to calculate a vector of such addresses.
10262
10263Arguments:
10264""""""""""
10265
10266The first argument is always a type used as the basis for the calculations.
10267The second argument is always a pointer or a vector of pointers, and is the
10268base address to start from. The remaining arguments are indices
10269that indicate which of the elements of the aggregate object are indexed.
10270The interpretation of each index is dependent on the type being indexed
10271into. The first index always indexes the pointer value given as the
10272second argument, the second index indexes a value of the type pointed to
10273(not necessarily the value directly pointed to, since the first index
10274can be non-zero), etc. The first type indexed into must be a pointer
value; subsequent types can be arrays, vectors, and structs. Note that
10276subsequent types being indexed into can never be pointers, since that
10277would require loading the pointer before continuing calculation.
10278
10279The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
10281**constants** are allowed (when using a vector of indices they must all
10282be the **same** ``i32`` integer constant). When indexing into an array,
10283pointer or vector, integers of any width are allowed, and they are not
10284required to be constant. These integers are treated as signed values
10285where relevant.
10286
10287For example, let's consider a C code fragment and how it gets compiled
10288to LLVM:
10289
10290.. code-block:: c
10291
10292    struct RT {
10293      char A;
10294      int B[10][20];
10295      char C;
10296    };
10297    struct ST {
10298      int X;
10299      double Y;
10300      struct RT Z;
10301    };
10302
10303    int *foo(struct ST *s) {
10304      return &s[1].Z.B[5][13];
10305    }
10306
10307The LLVM code generated by Clang is:
10308
10309.. code-block:: llvm
10310
10311    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
10312    %struct.ST = type { i32, double, %struct.RT }
10313
10314    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
10315    entry:
10316      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
10317      ret i32* %arrayidx
10318    }
10319
10320Semantics:
10321""""""""""
10322
10323In the example above, the first index is indexing into the
10324'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
10325= '``{ i32, double, %struct.RT }``' type, a structure. The second index
10326indexes into the third element of the structure, yielding a
10327'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
10328structure. The third index indexes into the second element of the
10329structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
10330dimensions of the array are subscripted into, yielding an '``i32``'
10331type. The '``getelementptr``' instruction returns a pointer to this
10332element, thus computing a value of '``i32*``' type.
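
The same walk can be written as byte arithmetic in C. The sketch below repeats
the ``struct RT`` and ``struct ST`` definitions from the example and sums the
contribution of each index. The function name ``foo_offset`` is hypothetical,
and the concrete numbers depend on the target's data layout.

.. code-block:: c

    #include <stddef.h>

    struct RT { char A; int B[10][20]; char C; };
    struct ST { int X; double Y; struct RT Z; };

    /* Byte offset that the indices 1, 2, 1, 5, 13 add to the base pointer. */
    size_t foo_offset(void) {
      return 1 * sizeof(struct ST)     /* i64 1:  step over one %struct.ST   */
           + offsetof(struct ST, Z)    /* i32 2:  third field of %struct.ST  */
           + offsetof(struct RT, B)    /* i32 1:  second field of %struct.RT */
           + 5 * sizeof(int[20])       /* i64 5:  row 5 of the outer array   */
           + 13 * sizeof(int);         /* i64 13: element 13 of that row     */
    }

Adding this offset to the address held in ``s`` gives ``&s[1].Z.B[5][13]``.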
10333
10334Note that it is perfectly legal to index partially through a structure,
10335returning a pointer to an inner element. Because of this, the LLVM code
10336for the given testcase is equivalent to:
10337
10338.. code-block:: llvm
10339
10340    define i32* @foo(%struct.ST* %s) {
10341      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
10342      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
10343      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
10344      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
10345      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
10346      ret i32* %t5
10347    }
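
In C terms, each intermediate ``getelementptr`` takes the address of one more
level of the nested object without loading anything. A sketch, again assuming
the ``struct RT`` and ``struct ST`` definitions from the example
(``foo_stepwise`` is a hypothetical name):

.. code-block:: c

    struct RT { char A; int B[10][20]; char C; };
    struct ST { int X; double Y; struct RT Z; };

    /* Same address as &s[1].Z.B[5][13], built one step at a time. */
    int *foo_stepwise(struct ST *s) {
      struct ST *t1 = s + 1;        /* %t1: the second struct ST      */
      struct RT *t2 = &t1->Z;       /* %t2: field Z                   */
      int (*t3)[10][20] = &t2->B;   /* %t3: the whole 10 x 20 array   */
      int (*t4)[20] = &(*t3)[5];    /* %t4: row 5                     */
      int *t5 = &(*t4)[13];         /* %t5: element 13 of that row    */
      return t5;
    }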
10348
10349If the ``inbounds`` keyword is present, the result value of the
10350``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
10351following rules is violated:
10352
10353*  The base pointer has an *in bounds* address of an allocated object, which
10354   means that it points into an allocated object, or to its end. The only
10355   *in bounds* address for a null pointer in the default address-space is the
10356   null pointer itself.
10357*  If the type of an index is larger than the pointer index type, the
10358   truncation to the pointer index type preserves the signed value.
10359*  The multiplication of an index by the type size does not wrap the pointer
10360   index type in a signed sense (``nsw``).
10361*  The successive addition of offsets (without adding the base address) does
10362   not wrap the pointer index type in a signed sense (``nsw``).
10363*  The successive addition of the current address, interpreted as an unsigned
10364   number, and an offset, interpreted as a signed number, does not wrap the
10365   unsigned address space and remains *in bounds* of the allocated object.
10366   As a corollary, if the added offset is non-negative, the addition does not
10367   wrap in an unsigned sense (``nuw``).
10368*  In cases where the base is a vector of pointers, the ``inbounds`` keyword
10369   applies to each of the computations element-wise.
10370
10371These rules are based on the assumption that no allocated object may cross
10372the unsigned address space boundary, and no allocated object may be larger
10373than half the pointer index type space.
10374
10375If the ``inbounds`` keyword is not present, the offsets are added to the
10376base address with silently-wrapping two's complement arithmetic. If the
10377offsets have a different width from the pointer, they are sign-extended
10378or truncated to the width of the pointer. The result value of the
10379``getelementptr`` may be outside the object pointed to by the base
10380pointer. The result value may not necessarily be used to access memory
10381though, even if it happens to point into allocated storage. See the
10382:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
10383information.
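
A minimal C model of this arithmetic for a single index over ``i8``, assuming
the pointer index type has the width of ``uintptr_t``. The helper name
``gep_i8_address`` is hypothetical.

.. code-block:: c

    #include <stdint.h>

    /* Address computed by 'getelementptr i8, i8* base, i32 idx' without
       inbounds: the index is sign-extended to pointer width and added with
       wrapping two's complement arithmetic. */
    uintptr_t gep_i8_address(uintptr_t base, int32_t idx) {
      uintptr_t offset = (uintptr_t)(intptr_t)idx; /* sign-extend the index */
      return base + offset;                        /* unsigned add wraps    */
    }

For example, ``gep_i8_address(100, -4)`` is 96, and ``gep_i8_address(0, -1)``
wraps around to ``UINTPTR_MAX`` instead of being an error.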
10384
10385If the ``inrange`` keyword is present before any index, loading from or
10386storing to any pointer derived from the ``getelementptr`` has undefined
10387behavior if the load or store would access memory outside of the bounds of
10388the element selected by the index marked as ``inrange``. The result of a
10389pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
10390involving memory) involving a pointer derived from a ``getelementptr`` with
10391the ``inrange`` keyword is undefined, with the exception of comparisons
10392in the case where both operands are in the range of the element selected
10393by the ``inrange`` keyword, inclusive of the address one past the end of
10394that element. Note that the ``inrange`` keyword is currently only allowed
10395in constant ``getelementptr`` expressions.
10396
The ``getelementptr`` instruction is often confusing. For some more insight
10398into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
10399
10400Example:
10401""""""""
10402
10403.. code-block:: llvm
10404
10405        ; yields [12 x i8]*:aptr
10406        %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
10407        ; yields i8*:vptr
10408        %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
10409        ; yields i8*:eptr
10410        %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
10411        ; yields i32*:iptr
10412        %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0
10413
10414Vector of pointers:
10415"""""""""""""""""""
10416
10417The ``getelementptr`` returns a vector of pointers, instead of a single address,
10418when one or more of its arguments is a vector. In such cases, all vector
10419arguments should have the same number of elements, and every scalar argument
10420will be effectively broadcast into a vector during address calculation.
10421
10422.. code-block:: llvm
10423
10424     ; All arguments are vectors:
10425     ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
10426     %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
10427
10428     ; Add the same scalar offset to each pointer of a vector:
10429     ;   A[i] = ptrs[i] + offset*sizeof(i8)
10430     %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
10431
10432     ; Add distinct offsets to the same pointer:
10433     ;   A[i] = ptr + offsets[i]*sizeof(i8)
10434     %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
10435
10436     ; In all cases described above the type of the result is <4 x i8*>
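
The three forms can be read as the following scalar C loops over four lanes.
This is only a sketch of the lane-wise arithmetic; the function names and the
``LANES`` constant are hypothetical.

.. code-block:: c

    #include <stdint.h>

    enum { LANES = 4 };

    /* All arguments are vectors: out[i] = ptrs[i] + offsets[i]. */
    void gep_vec_vec(char *out[LANES], char *const ptrs[LANES],
                     const int64_t offsets[LANES]) {
      for (int i = 0; i < LANES; ++i)
        out[i] = ptrs[i] + offsets[i];
    }

    /* Scalar offset broadcast to every lane: out[i] = ptrs[i] + offset. */
    void gep_vec_scalar(char *out[LANES], char *const ptrs[LANES],
                        int64_t offset) {
      for (int i = 0; i < LANES; ++i)
        out[i] = ptrs[i] + offset;
    }

    /* Scalar pointer broadcast to every lane: out[i] = ptr + offsets[i]. */
    void gep_scalar_vec(char *out[LANES], char *ptr,
                        const int64_t offsets[LANES]) {
      for (int i = 0; i < LANES; ++i)
        out[i] = ptr + offsets[i];
    }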
10437
10438The two following instructions are equivalent:
10439
10440.. code-block:: llvm
10441
10442     getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10443       <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
10444       <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
10445       <4 x i32> %ind4,
10446       <4 x i64> <i64 13, i64 13, i64 13, i64 13>
10447
10448     getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10449       i32 2, i32 1, <4 x i32> %ind4, i64 13
10450
Consider a C loop where the vector version of ``getelementptr`` is
useful:
10453
10454.. code-block:: c
10455
10456    // Let's assume that we vectorize the following loop:
10457    double *A, *B; int *C;
10458    for (int i = 0; i < size; ++i) {
10459      A[i] = B[C[i]];
10460    }
10461
10462.. code-block:: llvm
10463
10464    ; get pointers for 8 elements from array B
10465    %ptrs = getelementptr double, double* %B, <8 x i32> %C
10466    ; load 8 elements from array B into A
10467    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
10468         i32 8, <8 x i1> %mask, <8 x double> %passthru)
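
Lane by lane, the two instructions above behave like the scalar loop below.
The helper name ``gather8`` is hypothetical, and the explicit ``mask`` and
``passthru`` arrays are assumptions of the sketch that stand in for the
``%mask`` and ``%passthru`` operands.

.. code-block:: c

    #include <stdbool.h>

    /* For each of 8 lanes: compute &B[C[i]] (the getelementptr) and load
       through it if the lane is enabled, otherwise keep the passthru value. */
    void gather8(double out[8], const double *B, const int C[8],
                 const bool mask[8], const double passthru[8]) {
      for (int i = 0; i < 8; ++i) {
        const double *ptr = B + C[i];          /* one lane of %ptrs      */
        out[i] = mask[i] ? *ptr : passthru[i]; /* one lane of the gather */
      }
    }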
10469
10470Conversion Operations
10471---------------------
10472
10473The instructions in this category are the conversion instructions
10474(casting) which all take a single operand and a type. They perform
10475various bit conversions on the operand.
10476
10477.. _i_trunc:
10478
10479'``trunc .. to``' Instruction
10480^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10481
10482Syntax:
10483"""""""
10484
10485::
10486
10487      <result> = trunc <ty> <value> to <ty2>             ; yields ty2
10488
10489Overview:
10490"""""""""
10491
10492The '``trunc``' instruction truncates its operand to the type ``ty2``.
10493
10494Arguments:
10495""""""""""
10496
The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
10499of the same number of integers. The bit size of the ``value`` must be
10500larger than the bit size of the destination type, ``ty2``. Equal sized
10501types are not allowed.
10502
10503Semantics:
10504""""""""""
10505
10506The '``trunc``' instruction truncates the high order bits in ``value``
10507and converts the remaining bits to ``ty2``. Since the source size must
10508be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
10509It will always truncate bits.
10510
10511Example:
10512""""""""
10513
10514.. code-block:: llvm
10515
10516      %X = trunc i32 257 to i8                        ; yields i8:1
10517      %Y = trunc i32 123 to i1                        ; yields i1:true
10518      %Z = trunc i32 122 to i1                        ; yields i1:false
10519      %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
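
In C, the scalar results follow from keeping only the low-order bits of an
unsigned value. A small sketch with hypothetical helper names:

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* trunc i32 -> i8: keep the low 8 bits. */
    uint8_t trunc_i32_to_i8(uint32_t value) { return (uint8_t)value; }

    /* trunc i32 -> i1: keep only the lowest bit. */
    bool trunc_i32_to_i1(uint32_t value) { return (value & 1u) != 0; }

Here 257 is ``0x101``, so its low 8 bits are 1, while 123 is odd and 122 is
even, which gives ``true`` and ``false`` for the ``i1`` cases.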
10520
10521.. _i_zext:
10522
10523'``zext .. to``' Instruction
10524^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10525
10526Syntax:
10527"""""""
10528
10529::
10530
10531      <result> = zext <ty> <value> to <ty2>             ; yields ty2
10532
10533Overview:
10534"""""""""
10535
10536The '``zext``' instruction zero extends its operand to type ``ty2``.
10537
10538Arguments:
10539""""""""""
10540
10541The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
10543the same number of integers. The bit size of the ``value`` must be
10544smaller than the bit size of the destination type, ``ty2``.
10545
10546Semantics:
10547""""""""""
10548
10549The ``zext`` fills the high order bits of the ``value`` with zero bits
10550until it reaches the size of the destination type, ``ty2``.
10551
10552When zero extending from i1, the result will always be either 0 or 1.
10553
10554Example:
10555""""""""
10556
10557.. code-block:: llvm
10558
10559      %X = zext i32 257 to i64              ; yields i64:257
10560      %Y = zext i1 true to i32              ; yields i32:1
10561      %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
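
A C sketch of the scalar cases, using unsigned types so that widening fills the
new high bits with zeros (the helper names are hypothetical):

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* zext i32 -> i64: the upper 32 bits of the result are zero. */
    uint64_t zext_i32_to_i64(uint32_t value) { return (uint64_t)value; }

    /* zext i1 -> i32: the result is 0 or 1. */
    uint32_t zext_i1_to_i32(bool value) { return value ? 1u : 0u; }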
10562
10563.. _i_sext:
10564
10565'``sext .. to``' Instruction
10566^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10567
10568Syntax:
10569"""""""
10570
10571::
10572
10573      <result> = sext <ty> <value> to <ty2>             ; yields ty2
10574
10575Overview:
10576"""""""""
10577
The '``sext``' instruction sign extends ``value`` to the type ``ty2``.
10579
10580Arguments:
10581""""""""""
10582
10583The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
10585the same number of integers. The bit size of the ``value`` must be
10586smaller than the bit size of the destination type, ``ty2``.
10587
10588Semantics:
10589""""""""""
10590
10591The '``sext``' instruction performs a sign extension by copying the sign
10592bit (highest order bit) of the ``value`` until it reaches the bit size
10593of the type ``ty2``.
10594
10595When sign extending from i1, the extension always results in -1 or 0.
10596
10597Example:
10598""""""""
10599
10600.. code-block:: llvm
10601
      %X = sext i8 -1 to i16               ; yields i16:-1 (65535 if read as unsigned)
10603      %Y = sext i1 true to i32             ; yields i32:-1
10604      %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
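
A C sketch of the scalar cases, using signed types so that widening copies the
sign bit (the helper names are hypothetical):

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* sext i8 -> i16: the sign bit is copied into the new high bits. */
    int16_t sext_i8_to_i16(int8_t value) { return (int16_t)value; }

    /* sext i1 -> i32: 'true' has its only bit set, so it extends to -1. */
    int32_t sext_i1_to_i32(bool value) { return value ? -1 : 0; }

The ``i16`` result for -1 has all sixteen bits set, which is -1 as a signed
number and 65535 if the same bits are read as unsigned.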
10605
10606'``fptrunc .. to``' Instruction
10607^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10608
10609Syntax:
10610"""""""
10611
10612::
10613
10614      <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
10615
10616Overview:
10617"""""""""
10618
10619The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
10620
10621Arguments:
10622""""""""""
10623
10624The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
10625value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
10626The size of ``value`` must be larger than the size of ``ty2``. This
10627implies that ``fptrunc`` cannot be used to make a *no-op cast*.
10628
10629Semantics:
10630""""""""""
10631
10632The '``fptrunc``' instruction casts a ``value`` from a larger
10633:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
10634<t_floating>` type.
10635This instruction is assumed to execute in the default :ref:`floating-point
10636environment <floatenv>`.
10637
10638Example:
10639""""""""
10640
10641.. code-block:: llvm
10642
10643      %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
10644      %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity
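
The first example can be reproduced in C, assuming IEEE 754 ``float`` and
``double`` and the default round-to-nearest mode. The name
``fptrunc_f64_to_f32`` is hypothetical.

.. code-block:: c

    /* fptrunc double -> float: the value is rounded to a representable
       float. */
    float fptrunc_f64_to_f32(double value) { return (float)value; }

The input 16777217.0 is 2^24 + 1, which lies exactly halfway between the two
neighbouring ``float`` values 16777216.0 and 16777218.0, so round-to-nearest
with ties to even selects 16777216.0.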
10645
10646'``fpext .. to``' Instruction
10647^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10648
10649Syntax:
10650"""""""
10651
10652::
10653
10654      <result> = fpext <ty> <value> to <ty2>             ; yields ty2
10655
10656Overview:
10657"""""""""
10658
The '``fpext``' instruction extends a floating-point ``value`` to a larger
floating-point value.
10661
10662Arguments:
10663""""""""""
10664
10665The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
10666``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
10667to. The source type must be smaller than the destination type.
10668
10669Semantics:
10670""""""""""
10671
10672The '``fpext``' instruction extends the ``value`` from a smaller
10673:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
10674<t_floating>` type. The ``fpext`` cannot be used to make a
10675*no-op cast* because it always changes bits. Use ``bitcast`` to make a
10676*no-op cast* for a floating-point cast.
10677
10678Example:
10679""""""""
10680
10681.. code-block:: llvm
10682
10683      %X = fpext float 3.125 to double         ; yields double:3.125000e+00
10684      %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
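
Although the bit pattern changes, the numeric value does not. A C sketch,
assuming IEEE 754 ``float`` and ``double`` (``fpext_f32_to_f64`` is a
hypothetical name):

.. code-block:: c

    /* fpext float -> double: every float value is exactly representable
       as a double, so no rounding takes place. */
    double fpext_f32_to_f64(float value) { return (double)value; }

Converting the result back to ``float`` therefore returns the original value.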
10685
10686'``fptoui .. to``' Instruction
10687^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10688
10689Syntax:
10690"""""""
10691
10692::
10693
10694      <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
10695
10696Overview:
10697"""""""""
10698
The '``fptoui``' instruction converts a floating-point ``value`` to its
unsigned integer equivalent of type ``ty2``.
10701
10702Arguments:
10703""""""""""
10704
10705The '``fptoui``' instruction takes a value to cast, which must be a
10706scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10710
10711Semantics:
10712""""""""""
10713
10714The '``fptoui``' instruction converts its :ref:`floating-point
10715<t_floating>` operand into the nearest (rounding towards zero)
10716unsigned integer value. If the value cannot fit in ``ty2``, the result
10717is a :ref:`poison value <poisonvalues>`.
10718
10719Example:
10720""""""""
10721
10722.. code-block:: llvm
10723
10724      %X = fptoui double 123.0 to i32      ; yields i32:123
      %Y = fptoui double 1.0E+300 to i1    ; yields poison (value does not fit in i1)
      %Z = fptoui double 1.04E+17 to i8    ; yields poison (value does not fit in i8)
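
The range rule can be modelled in C as follows for an ``i8`` destination. The
helper ``fptoui_f64_to_i8`` is hypothetical and reports the poison case
through its return value instead of producing a result.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Returns false where 'fptoui double ... to i8' would yield poison:
       NaN, or a value that does not fit after rounding toward zero. */
    bool fptoui_f64_to_i8(double value, uint8_t *result) {
      if (!(value > -1.0 && value < 256.0))
        return false;
      *result = (uint8_t)value; /* C conversion also rounds toward zero */
      return true;
    }

The check is written on the open interval (-1, 256) because a value such as
-0.5 rounds toward zero to 0, which fits.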
10727
10728'``fptosi .. to``' Instruction
10729^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10730
10731Syntax:
10732"""""""
10733
10734::
10735
10736      <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
10737
10738Overview:
10739"""""""""
10740
10741The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
10742``value`` to type ``ty2``.
10743
10744Arguments:
10745""""""""""
10746
10747The '``fptosi``' instruction takes a value to cast, which must be a
10748scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10752
10753Semantics:
10754""""""""""
10755
10756The '``fptosi``' instruction converts its :ref:`floating-point
10757<t_floating>` operand into the nearest (rounding towards zero)
10758signed integer value. If the value cannot fit in ``ty2``, the result
10759is a :ref:`poison value <poisonvalues>`.
10760
10761Example:
10762""""""""
10763
10764.. code-block:: llvm
10765
10766      %X = fptosi double -123.0 to i32      ; yields i32:-123
      %Y = fptosi double 1.0E-247 to i1     ; yields i1:false (rounds toward zero to 0)
      %Z = fptosi double 1.04E+17 to i8     ; yields poison (value does not fit in i8)
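
A signed counterpart of the range rule, sketched in C for an ``i8``
destination. The helper ``fptosi_f64_to_i8`` is hypothetical and returns
``false`` where the instruction would yield poison.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Accepts values that round toward zero into [-128, 127]; rejects NaN
       and anything outside that range. */
    bool fptosi_f64_to_i8(double value, int8_t *result) {
      if (!(value > -129.0 && value < 128.0))
        return false;
      *result = (int8_t)value; /* C conversion also rounds toward zero */
      return true;
    }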
10769
10770'``uitofp .. to``' Instruction
10771^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10772
10773Syntax:
10774"""""""
10775
10776::
10777
10778      <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
10779
10780Overview:
10781"""""""""
10782
10783The '``uitofp``' instruction regards ``value`` as an unsigned integer
10784and converts that value to the ``ty2`` type.
10785
10786Arguments:
10787""""""""""
10788
10789The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
10794
10795Semantics:
10796""""""""""
10797
10798The '``uitofp``' instruction interprets its operand as an unsigned
10799integer quantity and converts it to the corresponding floating-point
10800value. If the value cannot be exactly represented, it is rounded using
10801the default rounding mode.
10802
10803
10804Example:
10805""""""""
10806
10807.. code-block:: llvm
10808
10809      %X = uitofp i32 257 to float         ; yields float:257.0
10810      %Y = uitofp i8 -1 to double          ; yields double:255.0
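
A C sketch of the rounding case, assuming IEEE 754 ``float`` and the default
round-to-nearest mode (``uitofp_i32_to_f32`` is a hypothetical name):

.. code-block:: c

    #include <stdint.h>

    /* uitofp i32 -> float: integers above 2^24 may not be exactly
       representable and are then rounded. */
    float uitofp_i32_to_f32(uint32_t value) { return (float)value; }

The value 257 converts exactly, whereas 16777217 (2^24 + 1) is rounded to
16777216.0.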
10811
10812'``sitofp .. to``' Instruction
10813^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10814
10815Syntax:
10816"""""""
10817
10818::
10819
10820      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
10821
10822Overview:
10823"""""""""
10824
10825The '``sitofp``' instruction regards ``value`` as a signed integer and
10826converts that value to the ``ty2`` type.
10827
10828Arguments:
10829""""""""""
10830
10831The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
10836
10837Semantics:
10838""""""""""
10839
10840The '``sitofp``' instruction interprets its operand as a signed integer
10841quantity and converts it to the corresponding floating-point value. If the
10842value cannot be exactly represented, it is rounded using the default rounding
10843mode.
10844
10845Example:
10846""""""""
10847
10848.. code-block:: llvm
10849
10850      %X = sitofp i32 257 to float         ; yields float:257.0
10851      %Y = sitofp i8 -1 to double          ; yields double:-1.0
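
The difference from ``uitofp`` lies only in how the operand bits are
interpreted. A C sketch with hypothetical helper names that converts the bit
pattern of ``i8 -1`` both ways:

.. code-block:: c

    #include <stdint.h>

    /* sitofp: the bits are read as a signed number. */
    double as_signed(int8_t bits) { return (double)bits; }

    /* uitofp: the same bits are read as an unsigned number. */
    double as_unsigned(int8_t bits) { return (double)(uint8_t)bits; }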
10852
10853.. _i_ptrtoint:
10854
10855'``ptrtoint .. to``' Instruction
10856^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10857
10858Syntax:
10859"""""""
10860
10861::
10862
10863      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
10864
10865Overview:
10866"""""""""
10867
10868The '``ptrtoint``' instruction converts the pointer or a vector of
10869pointers ``value`` to the integer (or vector of integers) type ``ty2``.
10870
10871Arguments:
10872""""""""""
10873
10874The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
10875a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
10876type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
10877a vector of integers type.
10878
10879Semantics:
10880""""""""""
10881
10882The '``ptrtoint``' instruction converts ``value`` to integer type
10883``ty2`` by interpreting the pointer value as an integer and either
10884truncating or zero extending that value to the size of the integer type.
10885If ``value`` is smaller than ``ty2`` then a zero extension is done. If
10886``value`` is larger than ``ty2`` then a truncation is done. If they are
10887the same size, then nothing is done (*no-op cast*) other than a type
10888change.
10889
10890Example:
10891""""""""
10892
10893.. code-block:: llvm
10894
10895      %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
10896      %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x i32*> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture
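
In C the same conversions go through ``uintptr_t``. The sketch below assumes
that converting a pointer to ``uintptr_t`` yields its address bits, which is
implementation-defined in C; the helper names are hypothetical.

.. code-block:: c

    #include <stdint.h>

    /* ptrtoint to i8: the address is truncated to its low 8 bits. */
    uint8_t ptrtoint_i8(const void *p) { return (uint8_t)(uintptr_t)p; }

    /* ptrtoint to a type at least as wide as a pointer: the address is
       zero extended, or unchanged if the widths match. */
    uintmax_t ptrtoint_wide(const void *p) { return (uintmax_t)(uintptr_t)p; }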
10898
10899.. _i_inttoptr:
10900
10901'``inttoptr .. to``' Instruction
10902^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10903
10904Syntax:
10905"""""""
10906
10907::
10908
10909      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2
10910
10911Overview:
10912"""""""""
10913
10914The '``inttoptr``' instruction converts an integer ``value`` to a
10915pointer type, ``ty2``.
10916
10917Arguments:
10918""""""""""
10919
10920The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
10921cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
10922type.
10923
10924The optional ``!dereferenceable`` metadata must reference a single metadata
10925name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
10926entry.
10927See ``dereferenceable`` metadata.
10928
10929The optional ``!dereferenceable_or_null`` metadata must reference a single
10930metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
10931``i64`` entry.
10932See ``dereferenceable_or_null`` metadata.
10933
10934Semantics:
10935""""""""""
10936
10937The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
10938applying either a zero extension or a truncation depending on the size
10939of the integer ``value``. If ``value`` is larger than the size of a
10940pointer then a truncation is done. If ``value`` is smaller than the size
10941of a pointer then a zero extension is done. If they are the same size,
10942nothing is done (*no-op cast*).
10943
10944Example:
10945""""""""
10946
10947.. code-block:: llvm
10948
10949      %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
10950      %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
10951      %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x i8*>; yields a vector of four pointers converted element-wise from G
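
A C sketch of the scalar cases, again going through ``uintptr_t`` and assuming
that integer-to-pointer conversion preserves the address bits (the helper names
are hypothetical):

.. code-block:: c

    #include <stdint.h>

    /* inttoptr from a 32-bit integer: zero extended when pointers are wider
       than 32 bits, a no-op when they are exactly 32 bits. */
    void *inttoptr_i32(uint32_t value) { return (void *)(uintptr_t)value; }

    /* A pointer converted to a pointer-sized integer and back compares
       equal to the original pointer. */
    void *round_trip(void *p) { return (void *)(uintptr_t)p; }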
10953
10954.. _i_bitcast:
10955
10956'``bitcast .. to``' Instruction
10957^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10958
10959Syntax:
10960"""""""
10961
10962::
10963
10964      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
10965
10966Overview:
10967"""""""""
10968
10969The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
10970changing any bits.
10971
10972Arguments:
10973""""""""""
10974
10975The '``bitcast``' instruction takes a value to cast, which must be a
10976non-aggregate first class value, and a type to cast it to, which must
10977also be a non-aggregate :ref:`first class <t_firstclass>` type. The
10978bit sizes of ``value`` and the destination type, ``ty2``, must be
10979identical. If the source type is a pointer, the destination type must
10980also be a pointer of the same size. This instruction supports bitwise
10981conversion of vectors to integers and to vectors of other types (as
10982long as they have the same size).
10983
10984Semantics:
10985""""""""""
10986
10987The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
10988is always a *no-op cast* because no bits change with this
10989conversion. The conversion is done as if the ``value`` had been stored
10990to memory and read back as type ``ty2``. Pointer (or vector of
10991pointers) types may only be converted to other pointer (or vector of
10992pointers) types with the same address space through this instruction.
10993To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
10994or :ref:`ptrtoint <i_ptrtoint>` instructions first.
10995
There is a caveat for bitcasts involving vector types in relation to
endianness. For example, ``bitcast <2 x i8> <value> to i16`` puts element zero
of the vector in the least significant bits of the i16 for little-endian while
element zero ends up in the most significant bits for big-endian.
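
The endianness caveat can be made concrete with a small sketch. The function
name ``@vec_to_int`` and the constant operands are illustrative choices, not
part of the instruction's definition; the values in the comments follow from
the rule stated above.

.. code-block:: llvm

      define i16 @vec_to_int() {
        ; Element 0 is 1 and element 1 is 2.
        %r = bitcast <2 x i8> <i8 1, i8 2> to i16
        ; little-endian: element 0 lands in the low byte,  giving 0x0201 (513)
        ; big-endian:    element 0 lands in the high byte, giving 0x0102 (258)
        ret i16 %r
      }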
11000
11001Example:
11002""""""""
11003
11004.. code-block:: text
11005
      %X = bitcast i8 255 to i8                ; yields i8:-1
      %Y = bitcast i32* %x to i16*             ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64         ; yields i64:%V (depends on endianness)
      %W = bitcast <2 x i32*> %P to <2 x i64*> ; yields <2 x i64*>
11010
11011.. _i_addrspacecast:
11012
11013'``addrspacecast .. to``' Instruction
11014^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11015
11016Syntax:
11017"""""""
11018
11019::
11020
11021      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2
11022
11023Overview:
11024"""""""""
11025
11026The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
11027address space ``n`` to type ``pty2`` in address space ``m``.
11028
11029Arguments:
11030""""""""""
11031
The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer (or vector of pointers) type to cast it to, which
must have a different address space.
11035
11036Semantics:
11037""""""""""
11038
11039The '``addrspacecast``' instruction converts the pointer value
11040``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
11041value modification, depending on the target and the address space
11042pair. Pointer conversions within the same address space must be
11043performed with the ``bitcast`` instruction. Note that if the address space
11044conversion is legal then both result and operand refer to the same memory
11045location.
11046
11047Example:
11048""""""""
11049
11050.. code-block:: llvm
11051
11052      %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
11053      %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
11054      %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z
11055
11056.. _otherops:
11057
11058Other Operations
11059----------------
11060
11061The instructions in this category are the "miscellaneous" instructions,
11062which defy better classification.
11063
11064.. _i_icmp:
11065
11066'``icmp``' Instruction
11067^^^^^^^^^^^^^^^^^^^^^^
11068
11069Syntax:
11070"""""""
11071
11072::
11073
11074      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
11075
11076Overview:
11077"""""""""
11078
11079The '``icmp``' instruction returns a boolean value or a vector of
11080boolean values based on comparison of its two integer, integer vector,
11081pointer, or pointer vector operands.
11082
11083Arguments:
11084""""""""""
11085
11086The '``icmp``' instruction takes three operands. The first operand is
11087the condition code indicating the kind of comparison to perform. It is
11088not a value, just a keyword. The possible condition codes are:
11089
11090#. ``eq``: equal
11091#. ``ne``: not equal
11092#. ``ugt``: unsigned greater than
11093#. ``uge``: unsigned greater or equal
11094#. ``ult``: unsigned less than
11095#. ``ule``: unsigned less or equal
11096#. ``sgt``: signed greater than
11097#. ``sge``: signed greater or equal
11098#. ``slt``: signed less than
11099#. ``sle``: signed less or equal
11100
The remaining two arguments must be of :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` type, or a :ref:`vector <t_vector>` of integers or
pointers. Both must have the same type.
11104
11105Semantics:
11106""""""""""
11107
The '``icmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. The comparison always yields either an
:ref:`i1 <t_integer>` or a vector of ``i1`` result, as follows:
11111
11112#. ``eq``: yields ``true`` if the operands are equal, ``false``
11113   otherwise. No sign interpretation is necessary or performed.
11114#. ``ne``: yields ``true`` if the operands are unequal, ``false``
11115   otherwise. No sign interpretation is necessary or performed.
11116#. ``ugt``: interprets the operands as unsigned values and yields
11117   ``true`` if ``op1`` is greater than ``op2``.
11118#. ``uge``: interprets the operands as unsigned values and yields
11119   ``true`` if ``op1`` is greater than or equal to ``op2``.
11120#. ``ult``: interprets the operands as unsigned values and yields
11121   ``true`` if ``op1`` is less than ``op2``.
11122#. ``ule``: interprets the operands as unsigned values and yields
11123   ``true`` if ``op1`` is less than or equal to ``op2``.
11124#. ``sgt``: interprets the operands as signed values and yields ``true``
11125   if ``op1`` is greater than ``op2``.
11126#. ``sge``: interprets the operands as signed values and yields ``true``
11127   if ``op1`` is greater than or equal to ``op2``.
11128#. ``slt``: interprets the operands as signed values and yields ``true``
11129   if ``op1`` is less than ``op2``.
11130#. ``sle``: interprets the operands as signed values and yields ``true``
11131   if ``op1`` is less than or equal to ``op2``.
11132
11133If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
11134are compared as if they were integers.
11135
If the operands are integer or pointer vectors, then they are compared element
by element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.
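
As a sketch of the rules above, the following functions show how the choice
between a signed and an unsigned condition code changes the result for the
same bit pattern, and how pointers are compared. The function names and
constants are hypothetical; the typed-pointer syntax matches the rest of this
document.

.. code-block:: llvm

      define <2 x i1> @signed_vs_unsigned() {
        ; -1 has all bits set, so it is 4294967295 when read as unsigned.
        %u = icmp ult <2 x i32> <i32 1, i32 -1>, <i32 2, i32 2>   ; yields <true, false>
        %s = icmp slt <2 x i32> <i32 1, i32 -1>, <i32 2, i32 2>   ; yields <true, true>
        ; Lanes where the two interpretations disagree.
        %d = xor <2 x i1> %u, %s                                  ; yields <false, true>
        ret <2 x i1> %d
      }

      define i1 @is_null(i8* %p) {
        ; Pointer operands are compared as if they were integers.
        %r = icmp eq i8* %p, null
        ret i1 %r
      }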
11139
11140Example:
11141""""""""
11142
11143.. code-block:: text
11144
11145      <result> = icmp eq i32 4, 5          ; yields: result=false
11146      <result> = icmp ne float* %X, %X     ; yields: result=false
11147      <result> = icmp ult i16  4, 5        ; yields: result=true
11148      <result> = icmp sgt i16  4, 5        ; yields: result=false
11149      <result> = icmp ule i16 -4, 5        ; yields: result=false
11150      <result> = icmp sge i16  4, 5        ; yields: result=false
11151
11152.. _i_fcmp:
11153
11154'``fcmp``' Instruction
11155^^^^^^^^^^^^^^^^^^^^^^
11156
11157Syntax:
11158"""""""
11159
11160::
11161
11162      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result
11163
11164Overview:
11165"""""""""
11166
11167The '``fcmp``' instruction returns a boolean value or vector of boolean
11168values based on comparison of its operands.
11169
11170If the operands are floating-point scalars, then the result type is a
11171boolean (:ref:`i1 <t_integer>`).
11172
11173If the operands are floating-point vectors, then the result type is a
11174vector of boolean with the same number of elements as the operands being
11175compared.
11176
11177Arguments:
11178""""""""""
11179
11180The '``fcmp``' instruction takes three operands. The first operand is
11181the condition code indicating the kind of comparison to perform. It is
11182not a value, just a keyword. The possible condition codes are:
11183
11184#. ``false``: no comparison, always returns false
11185#. ``oeq``: ordered and equal
11186#. ``ogt``: ordered and greater than
11187#. ``oge``: ordered and greater than or equal
11188#. ``olt``: ordered and less than
11189#. ``ole``: ordered and less than or equal
11190#. ``one``: ordered and not equal
11191#. ``ord``: ordered (no nans)
11192#. ``ueq``: unordered or equal
11193#. ``ugt``: unordered or greater than
11194#. ``uge``: unordered or greater than or equal
11195#. ``ult``: unordered or less than
11196#. ``ule``: unordered or less than or equal
11197#. ``une``: unordered or not equal
11198#. ``uno``: unordered (either nans)
11199#. ``true``: no comparison, always returns true
11200
11201*Ordered* means that neither operand is a QNAN while *unordered* means
11202that either operand may be a QNAN.
11203
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.
11207
11208Semantics:
11209""""""""""
11210
11211The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
11212condition code given as ``cond``. If the operands are vectors, then the
11213vectors are compared element by element. Each comparison performed
11214always yields an :ref:`i1 <t_integer>` result, as follows:
11215
11216#. ``false``: always yields ``false``, regardless of operands.
11217#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
11218   is equal to ``op2``.
11219#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
11220   is greater than ``op2``.
11221#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
11222   is greater than or equal to ``op2``.
11223#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
11224   is less than ``op2``.
11225#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
11226   is less than or equal to ``op2``.
11227#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
11228   is not equal to ``op2``.
11229#. ``ord``: yields ``true`` if both operands are not a QNAN.
11230#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
11231   equal to ``op2``.
11232#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
11233   greater than ``op2``.
11234#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
11235   greater than or equal to ``op2``.
11236#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
11237   less than ``op2``.
11238#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
11239   less than or equal to ``op2``.
11240#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
11241   not equal to ``op2``.
11242#. ``uno``: yields ``true`` if either operand is a QNAN.
11243#. ``true``: always yields ``true``, regardless of operands.
11244
11245The ``fcmp`` instruction can also optionally take any number of
11246:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11247otherwise unsafe floating-point optimizations.
11248
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
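
The difference between the ordered and unordered condition codes only shows
up when an operand is a NaN. The sketch below assumes no fast-math flags are
present; the function names are hypothetical, and ``0x7FF8000000000000`` is
the bit pattern of a quiet NaN for ``double``.

.. code-block:: llvm

      define i1 @is_nan(double %x) {
        ; A value compares unordered with itself only when it is a NaN.
        %r = fcmp uno double %x, %x
        ret i1 %r
      }

      define i1 @ordered_vs_unordered() {
        %o = fcmp one double 0x7FF8000000000000, 1.0   ; false: an operand is a NaN
        %u = fcmp une double 0x7FF8000000000000, 1.0   ; true:  an operand is a NaN
        %r = xor i1 %o, %u                             ; true:  the two codes disagree
        ret i1 %r
      }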
11253
11254Example:
11255""""""""
11256
11257.. code-block:: text
11258
11259      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
11260      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
11261      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
11262      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
11263
11264.. _i_phi:
11265
11266'``phi``' Instruction
11267^^^^^^^^^^^^^^^^^^^^^
11268
11269Syntax:
11270"""""""
11271
11272::
11273
11274      <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
11275
11276Overview:
11277"""""""""
11278
11279The '``phi``' instruction is used to implement the φ node in the SSA
11280graph representing the function.
11281
11282Arguments:
11283""""""""""
11284
11285The type of the incoming values is specified with the first type field.
11286After this, the '``phi``' instruction takes a list of pairs as
11287arguments, with one pair for each predecessor basic block of the current
11288block. Only values of :ref:`first class <t_firstclass>` type may be used as
11289the value arguments to the PHI node. Only labels may be used as the
11290label arguments.
11291
11292There must be no non-phi instructions between the start of a basic block
11293and the PHI instructions: i.e. PHI instructions must be first in a basic
11294block.
11295
11296For the purposes of the SSA form, the use of each incoming value is
11297deemed to occur on the edge from the corresponding predecessor block to
11298the current block (but after any definition of an '``invoke``'
11299instruction's return value on the same edge).
11300
11301The optional ``fast-math-flags`` marker indicates that the phi has one
11302or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
11303to enable otherwise unsafe floating-point optimizations. Fast-math-flags
11304are only valid for phis that return a floating-point scalar or vector
11305type, or an array (nested to any depth) of floating-point scalar or vector
11306types.
11307
11308Semantics:
11309""""""""""
11310
11311At runtime, the '``phi``' instruction logically takes on the value
11312specified by the pair corresponding to the predecessor basic block that
11313executed just prior to the current block.
11314
11315Example:
11316""""""""
11317
11318.. code-block:: llvm
11319
11320    Loop:       ; Infinite loop that counts from 0 on up...
11321      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
11322      %nextindvar = add i32 %indvar, 1
11323      br label %Loop
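
A second common shape, sketched here with hypothetical names, is a value that
is computed differently on two paths and merged where the paths join. The
``phi`` has one pair per predecessor of ``%merge``.

.. code-block:: llvm

      define i32 @abs_diff(i32 %a, i32 %b) {
      entry:
        %cmp = icmp sgt i32 %a, %b
        br i1 %cmp, label %then, label %else

      then:
        %d1 = sub i32 %a, %b
        br label %merge

      else:
        %d2 = sub i32 %b, %a
        br label %merge

      merge:
        ; Takes %d1 when control arrived from %then, %d2 when it arrived from %else.
        %d = phi i32 [ %d1, %then ], [ %d2, %else ]
        ret i32 %d
      }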
11324
11325.. _i_select:
11326
11327'``select``' Instruction
11328^^^^^^^^^^^^^^^^^^^^^^^^
11329
11330Syntax:
11331"""""""
11332
11333::
11334
11335      <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
11336
11337      selty is either i1 or {<N x i1>}
11338
11339Overview:
11340"""""""""
11341
11342The '``select``' instruction is used to choose one value based on a
11343condition, without IR-level branching.
11344
11345Arguments:
11346""""""""""
11347
11348The '``select``' instruction requires an 'i1' value or a vector of 'i1'
11349values indicating the condition, and two values of the same :ref:`first
11350class <t_firstclass>` type.
11351
11352#. The optional ``fast-math flags`` marker indicates that the select has one or more
11353   :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
11354   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11355   for selects that return a floating-point scalar or vector type, or an array
11356   (nested to any depth) of floating-point scalar or vector types.
11357
11358Semantics:
11359""""""""""
11360
11361If the condition is an i1 and it evaluates to 1, the instruction returns
11362the first value argument; otherwise, it returns the second value
11363argument.
11364
11365If the condition is a vector of i1, then the value arguments must be
11366vectors of the same size, and the selection is done element by element.
11367
11368If the condition is an i1 and the value arguments are vectors of the
11369same size, then an entire vector is selected.
11370
11371Example:
11372""""""""
11373
11374.. code-block:: llvm
11375
11376      %X = select i1 true, i8 17, i8 42          ; yields i8:17
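
The scalar and vector forms of the condition can be sketched as follows. The
function names are hypothetical.

.. code-block:: llvm

      define i32 @max(i32 %a, i32 %b) {
        ; Scalar condition: the whole of %a or the whole of %b is chosen.
        %c = icmp sgt i32 %a, %b
        %m = select i1 %c, i32 %a, i32 %b
        ret i32 %m
      }

      define <2 x i32> @pick_lanes(<2 x i1> %mask, <2 x i32> %a, <2 x i32> %b) {
        ; Vector condition: each result element comes from %a where the
        ; corresponding element of %mask is true, and from %b otherwise.
        %r = select <2 x i1> %mask, <2 x i32> %a, <2 x i32> %b
        ret <2 x i32> %r
      }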
11377
11378
11379.. _i_freeze:
11380
11381'``freeze``' Instruction
11382^^^^^^^^^^^^^^^^^^^^^^^^
11383
11384Syntax:
11385"""""""
11386
11387::
11388
11389      <result> = freeze ty <val>    ; yields ty:result
11390
11391Overview:
11392"""""""""
11393
11394The '``freeze``' instruction is used to stop propagation of
11395:ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
11396
11397Arguments:
11398""""""""""
11399
11400The '``freeze``' instruction takes a single argument.
11401
11402Semantics:
11403""""""""""
11404
11405If the argument is ``undef`` or ``poison``, '``freeze``' returns an
11406arbitrary, but fixed, value of type '``ty``'.
11407Otherwise, this instruction is a no-op and returns the input argument.
11408All uses of a value returned by the same '``freeze``' instruction are
11409guaranteed to always observe the same value, while different '``freeze``'
11410instructions may yield different values.
11411
11412While ``undef`` and ``poison`` pointers can be frozen, the result is a
11413non-dereferenceable pointer. See the
11414:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
11415If an aggregate value or vector is frozen, the operand is frozen element-wise.
11416The padding of an aggregate isn't considered, since it isn't visible
11417without storing it into memory and loading it with a different type.
11418
11419
11420Example:
11421""""""""
11422
11423.. code-block:: text
11424
11425      %w = i32 undef
11426      %x = freeze i32 %w
11427      %y = add i32 %w, %w         ; undef
11428      %z = add i32 %x, %x         ; even number because all uses of %x observe
11429                                  ; the same value
11430      %x2 = freeze i32 %w
11431      %cmp = icmp eq i32 %x, %x2  ; can be true or false
11432
11433      ; example with vectors
11434      %v = <2 x i32> <i32 undef, i32 poison>
11435      %a = extractelement <2 x i32> %v, i32 0    ; undef
11436      %b = extractelement <2 x i32> %v, i32 1    ; poison
11437      %add = add i32 %a, %a                      ; undef
11438
11439      %v.fr = freeze <2 x i32> %v                ; element-wise freeze
11440      %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
11441      %add.f = add i32 %d, %d                    ; even number
11442
11443      ; branching on frozen value
11444      %poison = add nsw i1 %k, undef   ; poison
11445      %c = freeze i1 %poison
11446      br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
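
The fragments above are schematic. A minimal self-contained function showing
the same property might look like the following sketch; the name
``@double_frozen`` is hypothetical.

.. code-block:: llvm

      define i32 @double_frozen(i32 %x) {
        ; If %x is undef or poison, %fr is some arbitrary but fixed i32.
        %fr = freeze i32 %x
        ; Both uses of %fr observe the same value, so the sum is always even.
        %r = add i32 %fr, %fr
        ret i32 %r
      }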
11447
11448
11449.. _i_call:
11450
11451'``call``' Instruction
11452^^^^^^^^^^^^^^^^^^^^^^
11453
11454Syntax:
11455"""""""
11456
11457::
11458
11459      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
11460                 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
11461
11462Overview:
11463"""""""""
11464
11465The '``call``' instruction represents a simple function call.
11466
11467Arguments:
11468""""""""""
11469
11470This instruction requires several arguments:
11471
11472#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
11473   should perform tail call optimization. The ``tail`` marker is a hint that
11474   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
11475   means that the call must be tail call optimized in order for the program to
11476   be correct. The ``musttail`` marker provides these guarantees:
11477
11478   #. The call will not cause unbounded stack growth if it is part of a
11479      recursive cycle in the call graph.
11480   #. Arguments with the :ref:`inalloca <attr_inalloca>` or
11481      :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
   #. If the musttail call appears in a function with the ``"thunk"`` attribute
      and the caller and callee both have varargs, then any unprototyped
      arguments in register or memory are forwarded to the callee. Similarly,
      the return value of the callee is returned to the caller's caller, even
      if a void return type is in use.
11487
11488   Both markers imply that the callee does not access allocas from the caller.
11489   The ``tail`` marker additionally implies that the callee does not access
11490   varargs from the caller. Calls marked ``musttail`` must obey the following
   additional rules:
11492
11493   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
11494     or a pointer bitcast followed by a ret instruction.
11495   - The ret instruction must return the (possibly bitcasted) value
11496     produced by the call, undef, or void.
11497   - The calling conventions of the caller and callee must match.
11498   - The callee must be varargs iff the caller is varargs. Bitcasting a
11499     non-varargs function to the appropriate varargs type is legal so
11500     long as the non-varargs prefixes obey the other rules.
   - The return type must not undergo automatic conversion to an ``sret`` pointer.
11502
   In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:
11504
11505   - All ABI-impacting function attributes, such as sret, byval, inreg,
11506     returned, and inalloca, must match.
11507   - The caller and callee prototypes must match. Pointer types of parameters
11508     or return types may differ in pointee type, but not in address space.
11509
   On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:
11511
   - Only these ABI-impacting attributes are allowed: sret, byval,
     swiftself, and swiftasync.
11514   - Prototypes are not required to match.
11515
11516   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
11517   the following conditions are met:
11518
11519   -  Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
11520   -  The call is in tail position (ret immediately follows call and ret
11521      uses value of call or is void).
11522   -  Option ``-tailcallopt`` is enabled,
11523      ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
11524      is ``tailcc``
11525   -  `Platform-specific constraints are
11526      met. <CodeGenerator.html#tailcallopt>`_
11527
11528#. The optional ``notail`` marker indicates that the optimizers should not add
11529   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
11530   call optimization from being performed on the call.
11531
11532#. The optional ``fast-math flags`` marker indicates that the call has one or more
11533   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11534   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11535   for calls that return a floating-point scalar or vector type, or an array
11536   (nested to any depth) of floating-point scalar or vector types.
11537
#. The optional ``cconv`` marker indicates which :ref:`calling
11539   convention <callingconv>` the call should use. If none is
11540   specified, the call defaults to using C calling conventions. The
11541   calling convention of the call must match the calling convention of
11542   the target function, or else the behavior is undefined.
11543#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
11544   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
11545   are valid here.
11546#. The optional addrspace attribute can be used to indicate the address space
11547   of the called function. If it is not specified, the program address space
11548   from the :ref:`datalayout string<langref_datalayout>` will be used.
11549#. '``ty``': the type of the call instruction itself which is also the
11550   type of the return value. Functions that return no value are marked
11551   ``void``.
11552#. '``fnty``': shall be the signature of the function being called. The
11553   argument types must match the types implied by this signature. This
11554   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect calls, which call through an arbitrary pointer-to-function
   value, are equally possible.
11559#. '``function args``': argument list whose types match the function
11560   signature argument types and parameter attributes. All arguments must
11561   be of :ref:`first class <t_firstclass>` type. If the function signature
11562   indicates the function accepts a variable number of arguments, the
11563   extra arguments can be specified.
11564#. The optional :ref:`function attributes <fnattrs>` list.
11565#. The optional :ref:`operand bundles <opbundles>` list.
11566
11567Semantics:
11568""""""""""
11569
11570The '``call``' instruction is used to cause control flow to transfer to
11571a specified function, with its incoming arguments bound to the specified
11572values. Upon a '``ret``' instruction in the called function, control
11573flow continues with the instruction after the function call, and the
11574return value of the function is bound to the result argument.
11575
11576Example:
11577""""""""
11578
11579.. code-block:: llvm
11580
11581      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
11583      %X = tail call i32 @foo()                                    ; yields i32
11584      %Y = tail call fastcc i32 @foo()  ; yields i32
      call void %foo(i8 signext 97)
11586
11587      %struct.A = type { i32, i8 }
11588      %r = call %struct.A @foo()                        ; yields { i32, i8 }
11589      %gr = extractvalue %struct.A %r, 0                ; yields i32
11590      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended
11593
LLVM treats calls to some functions with names and arguments that match
11595the standard C99 library as being the C99 library functions, and may
11596perform optimizations or generate code for them under that assumption.
11597This is something we'd like to change in the future to provide better
11598support for freestanding environments and non-C-based languages.
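
As a sketch of the ``musttail`` rules listed under Arguments, the first
function below forwards its argument to a callee with an identical prototype
and calling convention, and the ``ret`` immediately follows the call and
returns its value. The second function shows an indirect call through a
function pointer. All names are hypothetical, and the typed-pointer syntax
matches the rest of this document.

.. code-block:: llvm

      declare i32 @callee(i32)

      define i32 @forward(i32 %x) {
        ; Same prototype, same (default C) calling convention, ret follows directly.
        %r = musttail call i32 @callee(i32 %x)
        ret i32 %r
      }

      define i32 @apply(i32 (i32)* %fp, i32 %x) {
        ; Indirect call: %fp is an arbitrary pointer-to-function value.
        %r = call i32 %fp(i32 %x)
        ret i32 %r
      }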
11599
11600.. _i_va_arg:
11601
11602'``va_arg``' Instruction
11603^^^^^^^^^^^^^^^^^^^^^^^^
11604
11605Syntax:
11606"""""""
11607
11608::
11609
11610      <resultval> = va_arg <va_list*> <arglist>, <argty>
11611
11612Overview:
11613"""""""""
11614
11615The '``va_arg``' instruction is used to access arguments passed through
11616the "variable argument" area of a function call. It is used to implement
11617the ``va_arg`` macro in C.
11618
11619Arguments:
11620""""""""""
11621
11622This instruction takes a ``va_list*`` value and the type of the
11623argument. It returns a value of the specified argument type and
11624increments the ``va_list`` to point to the next argument. The actual
11625type of ``va_list`` is target specific.
11626
11627Semantics:
11628""""""""""
11629
11630The '``va_arg``' instruction loads an argument of the specified type
11631from the specified ``va_list`` and causes the ``va_list`` to point to
11632the next argument. For more information, see the variable argument
11633handling :ref:`Intrinsic Functions <int_varargs>`.
11634
11635It is legal for this instruction to be called in a function which does
11636not take a variable number of arguments, for example, the ``vfprintf``
11637function.
11638
11639``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
11640function <intrinsics>` because it takes a type as an argument.
11641
11642Example:
11643""""""""
11644
11645See the :ref:`variable argument processing <int_varargs>` section.
11646
Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.
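
For orientation, a minimal sketch of reading one ``i32`` from the variable
argument area follows. It assumes a target on which ``va_list`` is a single
``i8*``; on targets with a larger ``va_list`` the ``alloca`` would need the
target's actual type. The function name ``@first_vararg`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.va_start(i8*)
      declare void @llvm.va_end(i8*)

      define i32 @first_vararg(i32 %count, ...) {
        ; Assumption: va_list is a single i8* on this target.
        %ap = alloca i8*
        %ap.i8 = bitcast i8** %ap to i8*
        call void @llvm.va_start(i8* %ap.i8)
        ; Load one i32 and advance the va_list to the next argument.
        %v = va_arg i8** %ap, i32
        call void @llvm.va_end(i8* %ap.i8)
        ret i32 %v
      }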
11650
11651.. _i_landingpad:
11652
11653'``landingpad``' Instruction
11654^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11655
11656Syntax:
11657"""""""
11658
11659::
11660
11661      <resultval> = landingpad <resultty> <clause>+
11662      <resultval> = landingpad <resultty> cleanup <clause>*
11663
11664      <clause> := catch <type> <value>
11665      <clause> := filter <array constant type> <array constant>
11666
11667Overview:
11668"""""""""
11669
11670The '``landingpad``' instruction is used by `LLVM's exception handling
11671system <ExceptionHandling.html#overview>`_ to specify that a basic block
11672is a landing pad --- one where the exception lands, and corresponds to the
11673code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
11674defines values supplied by the :ref:`personality function <personalityfn>` upon
11675re-entry to the function. The ``resultval`` has the type ``resultty``.
11676
11677Arguments:
11678""""""""""
11679
The optional ``cleanup`` flag indicates that the landing pad block is a
cleanup.
11682
11683A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
11684contains the global variable representing the "type" that may be caught
11685or filtered respectively. Unlike the ``catch`` clause, the ``filter``
11686clause takes an array constant as its argument. Use
11687"``[0 x i8**] undef``" for a filter which cannot throw. The
11688'``landingpad``' instruction must contain *at least* one ``clause`` or
11689the ``cleanup`` flag.
11690
11691Semantics:
11692""""""""""
11693
11694The '``landingpad``' instruction defines the values which are set by the
11695:ref:`personality function <personalityfn>` upon re-entry to the function, and
11696therefore the "result type" of the ``landingpad`` instruction. As with
11697calling conventions, how the personality function results are
11698represented in LLVM IR is target specific.
11699
11700The clauses are applied in order from top to bottom. If two
11701``landingpad`` instructions are merged together through inlining, the
11702clauses from the calling function are appended to the list of clauses.
11703When the call stack is being unwound due to an exception being thrown,
11704the exception is compared against each ``clause`` in turn. If it doesn't
11705match any of the clauses, and the ``cleanup`` flag is not set, then
11706unwinding continues further up the call stack.
11707
11708The ``landingpad`` instruction has several restrictions:
11709
11710-  A landing pad block is a basic block which is the unwind destination
11711   of an '``invoke``' instruction.
11712-  A landing pad block must have a '``landingpad``' instruction as its
11713   first non-PHI instruction.
11714-  There can be only one '``landingpad``' instruction within the landing
11715   pad block.
11716-  A basic block that is not a landing pad block may not include a
11717   '``landingpad``' instruction.
11718
11719Example:
11720""""""""
11721
11722.. code-block:: llvm
11723
11724      ;; A landing pad which can catch an integer.
11725      %res = landingpad { i8*, i32 }
11726               catch i8** @_ZTIi
11727      ;; A landing pad that is a cleanup.
11728      %res = landingpad { i8*, i32 }
11729               cleanup
11730      ;; A landing pad which can catch an integer and can only throw a double.
11731      %res = landingpad { i8*, i32 }
11732               catch i8** @_ZTIi
11733               filter [1 x i8**] [@_ZTId]
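
The fragments above omit the surrounding function. A sketch of a complete
function is shown below, assuming the Itanium C++ personality routine
``@__gxx_personality_v0``; ``@may_throw`` and ``@try_call`` are hypothetical
names. The landing pad here only resumes unwinding, where a real handler would
inspect the selector value and run the catch body.

.. code-block:: llvm

      @_ZTIi = external constant i8*

      declare void @may_throw()
      declare i32 @__gxx_personality_v0(...)

      define i32 @try_call() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %lpad

      cont:
        ret i32 0

      lpad:
        ; The landingpad is the first non-PHI instruction of the unwind destination.
        %res = landingpad { i8*, i32 }
                 cleanup
                 catch i8** @_ZTIi
        ; Continue unwinding with the values supplied by the personality function.
        resume { i8*, i32 } %res
      }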
11734
11735.. _i_catchpad:
11736
11737'``catchpad``' Instruction
11738^^^^^^^^^^^^^^^^^^^^^^^^^^
11739
11740Syntax:
11741"""""""
11742
11743::
11744
11745      <resultval> = catchpad within <catchswitch> [<args>*]
11746
11747Overview:
11748"""""""""
11749
11750The '``catchpad``' instruction is used by `LLVM's exception handling
11751system <ExceptionHandling.html#overview>`_ to specify that a basic block
11752begins a catch handler --- one where a personality routine attempts to transfer
11753control to catch an exception.
11754
11755Arguments:
11756""""""""""
11757
11758The ``catchswitch`` operand must always be a token produced by a
11759:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
11760ensures that each ``catchpad`` has exactly one predecessor block, and it always
11761terminates in a ``catchswitch``.
11762
11763The ``args`` correspond to whatever information the personality routine
11764requires to know if this is an appropriate handler for the exception. Control
11765will transfer to the ``catchpad`` if this is the first appropriate handler for
11766the exception.
11767
11768The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
11769``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
11770pads.
11771
11772Semantics:
11773""""""""""
11774
11775When the call stack is being unwound due to an exception being thrown, the
11776exception is compared against the ``args``. If it doesn't match, control will
11777not reach the ``catchpad`` instruction.  The representation of ``args`` is
11778entirely target and personality function-specific.
11779
Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-PHI instruction of its parent basic block.
11782
11783The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
11784instructions is described in the
11785`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
11786
11787When a ``catchpad`` has been "entered" but not yet "exited" (as
11788described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11789it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11790that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11791
11792Example:
11793""""""""
11794
11795.. code-block:: text
11796
11797    dispatch:
11798      %cs = catchswitch within none [label %handler0] unwind to caller
11799      ;; A catch block which can catch an integer.
11800    handler0:
11801      %tok = catchpad within %cs [i8** @_ZTIi]
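
The fragment above omits the surrounding function. As a hedged sketch of how
the pieces might fit together, the function below pairs the ``catchpad`` with
a ``catchret`` and attaches a ``"funclet"`` bundle to the call made inside the
handler. The names ``@may_throw``, ``@handle_int`` and ``@cp_demo`` are
hypothetical, the ``@__CxxFrameHandler3`` personality is an assumption, and
the ``catchpad`` argument is simply carried over from the fragment above
rather than being what that personality would really expect.

.. code-block:: llvm

    @_ZTIi = external constant i8*

    declare void @may_throw()               ; hypothetical callee
    declare void @handle_int()              ; hypothetical handler body
    declare i32 @__CxxFrameHandler3(...)    ; assumed personality routine

    define void @cp_demo() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller

    handler0:
      ; The only predecessor of this block ends in the catchswitch above.
      %tok = catchpad within %cs [i8** @_ZTIi]
      ; Calls made while the pad is entered carry a "funclet" bundle.
      call void @handle_int() [ "funclet"(token %tok) ]
      ; Exit the pad and resume normal control flow.
      catchret from %tok to label %done

    done:
      ret void
    }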
11802
11803.. _i_cleanuppad:
11804
11805'``cleanuppad``' Instruction
11806^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11807
11808Syntax:
11809"""""""
11810
11811::
11812
11813      <resultval> = cleanuppad within <parent> [<args>*]
11814
11815Overview:
11816"""""""""
11817
11818The '``cleanuppad``' instruction is used by `LLVM's exception handling
11819system <ExceptionHandling.html#overview>`_ to specify that a basic block
11820is a cleanup block --- one where a personality routine attempts to
11821transfer control to run cleanup actions.
11822The ``args`` correspond to whatever additional
11823information the :ref:`personality function <personalityfn>` requires to
11824execute the cleanup.
11825The ``resultval`` has the type :ref:`token <t_token>` and is used to
11826match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
11827The ``parent`` argument is the token of the funclet that contains the
11828``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
11829this operand may be the token ``none``.
11830
11831Arguments:
11832""""""""""
11833
11834The instruction takes a list of arbitrary values which are interpreted
11835by the :ref:`personality function <personalityfn>`.
11836
11837Semantics:
11838""""""""""
11839
11840When the call stack is being unwound due to an exception being thrown,
11841the :ref:`personality function <personalityfn>` transfers control to the
11842``cleanuppad`` with the aid of the personality-specific arguments.
11843As with calling conventions, how the personality function results are
11844represented in LLVM IR is target specific.
11845
11846The ``cleanuppad`` instruction has several restrictions:
11847
11848-  A cleanup block is a basic block which is the unwind destination of
11849   an exceptional instruction.
11850-  A cleanup block must have a '``cleanuppad``' instruction as its
11851   first non-PHI instruction.
11852-  There can be only one '``cleanuppad``' instruction within the
11853   cleanup block.
11854-  A basic block that is not a cleanup block may not include a
11855   '``cleanuppad``' instruction.
11856
11857When a ``cleanuppad`` has been "entered" but not yet "exited" (as
11858described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11859it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11860that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11861
11862Example:
11863""""""""
11864
11865.. code-block:: text
11866
11867      %tok = cleanuppad within %cs []
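
As a hedged illustration of the restrictions listed above, the sketch below
shows a cleanup block that is the unwind destination of an ``invoke``, begins
with ``cleanuppad``, and is left through a matching ``cleanupret``. The names
``@may_throw``, ``@run_dtor`` and ``@cleanup_demo`` are hypothetical, and the
``@__CxxFrameHandler3`` personality is an assumption. The pad is not nested in
another funclet, so its parent is ``none``.

.. code-block:: llvm

    declare void @may_throw()               ; hypothetical callee
    declare void @run_dtor()                ; hypothetical cleanup action
    declare i32 @__CxxFrameHandler3(...)    ; assumed personality routine

    define void @cleanup_demo() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      ; First non-PHI instruction of the cleanup block.
      %tok = cleanuppad within none []
      ; Calls made while the pad is entered carry a "funclet" bundle.
      call void @run_dtor() [ "funclet"(token %tok) ]
      ; Exit the pad and continue unwinding into the caller.
      cleanupret from %tok unwind to caller

    done:
      ret void
    }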
11868
11869.. _intrinsics:
11870
11871Intrinsic Functions
11872===================
11873
11874LLVM supports the notion of an "intrinsic function". These functions
11875have well known names and semantics and are required to follow certain
11876restrictions. Overall, these intrinsics represent an extension mechanism
11877for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc.).
11880
11881Intrinsic function names must all start with an "``llvm.``" prefix. This
11882prefix is reserved in LLVM for intrinsic names; thus, function names may
11883not begin with this prefix. Intrinsic functions must always be external
11884functions: you cannot define the body of intrinsic functions. Intrinsic
11885functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any newly added
intrinsic must be documented here.
11889
11890Some intrinsic functions can be overloaded, i.e., the intrinsic
11891represents a family of functions that perform the same operation but on
11892different data types. Because LLVM can represent over 8 million
11893different integer types, overloading is used commonly to allow an
11894intrinsic function to operate on any integer type. One or more of the
11895argument types or the result type can be overloaded to accept any
11896integer type. Argument types may also be defined as exactly matching a
11897previous argument's type or the result type. This allows an intrinsic
11898function which accepts multiple arguments, but needs all of them to be
11899of the same type, to only be overloaded with respect to a single
11900argument or the result.
11901
Overloaded intrinsics have the names of their overloaded argument
types encoded into their function names, each preceded by a period. Only
11904those types which are overloaded result in a name suffix. Arguments
11905whose type is matched against another type do not. For example, the
11906``llvm.ctpop`` function can take an integer of any width and returns an
11907integer of exactly the same integer width. This leads to a family of
11908functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
11909``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
11910overloaded, and only one type suffix is required. Because the argument's
11911type is matched against the return type, it does not require its own
11912name suffix.
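
As a small sketch of this naming scheme, the declarations below show three
members of the ``llvm.ctpop`` family, each with a single type suffix. The
wrapper function ``@popcount8`` is a hypothetical name used only to show a
call site.

.. code-block:: llvm

    ; One suffix per overloaded type; the argument type is matched
    ; against the result type and therefore adds no suffix of its own.
    declare i8 @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)
    declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

    define i8 @popcount8(i8 %val) {
      %n = call i8 @llvm.ctpop.i8(i8 %val)
      ret i8 %n
    }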
11913
:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
that depend on an unnamed type in one of their overloaded argument types get an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
different unnamed types as arguments (for example,
``llvm.ssa.copy.p0s_s.2(%42*)``). The number is tracked in the LLVM module and
ensures unique names within the module. When linking two modules together, a
name clash is still possible; in that case, one of the names is changed by
assigning it a new number.
11922
11923For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
11925integer or floating point types should not be relied upon for correct
11926code generation. In such cases, the recommended approach for target
11927maintainers when defining intrinsics is to create separate integer and
11928FP intrinsics rather than rely on overloading. For example, if different
11929codegen is required for ``llvm.target.foo(<4 x i32>)`` and
11930``llvm.target.foo(<4 x float>)`` then these should be split into
11931different intrinsics.
11932
11933To learn how to add an intrinsic function, please see the `Extending
11934LLVM Guide <ExtendingLLVM.html>`_.
11935
11936.. _int_varargs:
11937
11938Variable Argument Handling Intrinsics
11939-------------------------------------
11940
11941Variable argument support is defined in LLVM with the
11942:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
11943functions. These functions are related to the similarly named macros
11944defined in the ``<stdarg.h>`` header file.
11945
11946All of these functions operate on arguments that use a target-specific
11947value type "``va_list``". The LLVM assembly language reference manual
11948does not define what this type is, so all transformations should be
11949prepared to handle these functions regardless of the type used.
11950
11951This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
11952variable argument handling intrinsic functions are used.
11953
.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca i8*
      %aq2 = bitcast i8** %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)
11986
11987.. _int_va_start:
11988
11989'``llvm.va_start``' Intrinsic
11990^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11991
11992Syntax:
11993"""""""
11994
11995::
11996
11997      declare void @llvm.va_start(i8* <arglist>)
11998
11999Overview:
12000"""""""""
12001
12002The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
12003subsequent use by ``va_arg``.
12004
12005Arguments:
12006""""""""""
12007
12008The argument is a pointer to a ``va_list`` element to initialize.
12009
12010Semantics:
12011""""""""""
12012
12013The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
12014available in C. In a target-dependent way, it initializes the
12015``va_list`` element to which the argument points, so that the next call
12016to ``va_arg`` will produce the first variable argument passed to the
12017function. Unlike the C ``va_start`` macro, this intrinsic does not need
12018to know the last argument of the function as the compiler can figure
12019that out.
12020
12021'``llvm.va_end``' Intrinsic
12022^^^^^^^^^^^^^^^^^^^^^^^^^^^
12023
12024Syntax:
12025"""""""
12026
12027::
12028
12029      declare void @llvm.va_end(i8* <arglist>)
12030
12031Overview:
12032"""""""""
12033
12034The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
12035initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
12036
12037Arguments:
12038""""""""""
12039
12040The argument is a pointer to a ``va_list`` to destroy.
12041
12042Semantics:
12043""""""""""
12044
12045The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
12046available in C. In a target-dependent way, it destroys the ``va_list``
12047element to which the argument points. Calls to
12048:ref:`llvm.va_start <int_va_start>` and
12049:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
12050``llvm.va_end``.
12051
12052.. _int_va_copy:
12053
12054'``llvm.va_copy``' Intrinsic
12055^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12056
12057Syntax:
12058"""""""
12059
12060::
12061
12062      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)
12063
12064Overview:
12065"""""""""
12066
12067The '``llvm.va_copy``' intrinsic copies the current argument position
12068from the source argument list to the destination argument list.
12069
12070Arguments:
12071""""""""""
12072
12073The first argument is a pointer to a ``va_list`` element to initialize.
12074The second argument is a pointer to a ``va_list`` element to copy from.
12075
12076Semantics:
12077""""""""""
12078
12079The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
12080available in C. In a target-dependent way, it copies the source
12081``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
12083arbitrarily complex and require, for example, memory allocation.
12084
12085Accurate Garbage Collection Intrinsics
12086--------------------------------------
12087
12088LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
12089(GC) requires the frontend to generate code containing appropriate intrinsic
12090calls and select an appropriate GC strategy which knows how to lower these
12091intrinsics in a manner which is appropriate for the target collector.
12092
12093These intrinsics allow identification of :ref:`GC roots on the
12094stack <int_gcroot>`, as well as garbage collector implementations that
12095require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
12096Frontends for type-safe garbage collected languages should generate
12097these intrinsics to make use of the LLVM garbage collectors. For more
12098details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
12099
LLVM provides a second experimental set of intrinsics for describing garbage
12101collection safepoints in compiled code. These intrinsics are an alternative
12102to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
12103:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
12104differences in approach are covered in the `Garbage Collection with LLVM
12105<GarbageCollection.html>`_ documentation. The intrinsics themselves are
12106described in :doc:`Statepoints`.
12107
12108.. _int_gcroot:
12109
12110'``llvm.gcroot``' Intrinsic
12111^^^^^^^^^^^^^^^^^^^^^^^^^^^
12112
12113Syntax:
12114"""""""
12115
12116::
12117
12118      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
12119
12120Overview:
12121"""""""""
12122
12123The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
12124the code generator, and allows some metadata to be associated with it.
12125
12126Arguments:
12127""""""""""
12128
The first argument specifies the address of a stack object that contains
the root pointer. The second argument (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.
12133
12134Semantics:
12135""""""""""
12136
12137At runtime, a call to this intrinsic stores a null pointer into the
12138"ptrloc" location. At compile-time, the code generator generates
12139information to allow the runtime to find the pointer at GC safe points.
12140The '``llvm.gcroot``' intrinsic may only be used in a function which
12141:ref:`specifies a GC algorithm <gc>`.
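
A minimal sketch of declaring a root is shown below. It assumes the built-in
``"shadow-stack"`` strategy purely for illustration, passes ``null`` because no
metadata is associated with the root, and uses the hypothetical function name
``@root_demo``.

.. code-block:: llvm

    declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

    define void @root_demo() gc "shadow-stack" {
    entry:
      ; Stack slot that will hold the root pointer.
      %root = alloca i8*
      ; Register the slot as a GC root; no metadata is attached.
      call void @llvm.gcroot(i8** %root, i8* null)
      ret void
    }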
12142
12143.. _int_gcread:
12144
12145'``llvm.gcread``' Intrinsic
12146^^^^^^^^^^^^^^^^^^^^^^^^^^^
12147
12148Syntax:
12149"""""""
12150
12151::
12152
12153      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
12154
12155Overview:
12156"""""""""
12157
12158The '``llvm.gcread``' intrinsic identifies reads of references from heap
12159locations, allowing garbage collector implementations that require read
12160barriers.
12161
12162Arguments:
12163""""""""""
12164
The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).
12169
12170Semantics:
12171""""""""""
12172
12173The '``llvm.gcread``' intrinsic has the same semantics as a load
12174instruction, but may be replaced with substantially more complex code by
12175the garbage collector runtime, as needed. The '``llvm.gcread``'
12176intrinsic may only be used in a function which :ref:`specifies a GC
12177algorithm <gc>`.
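
As a hedged sketch, a reference load through the read barrier might look as
follows. The function name ``@load_field`` is hypothetical, and the
``"shadow-stack"`` strategy is assumed only so that the function specifies a
GC algorithm.

.. code-block:: llvm

    declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

    define i8* @load_field(i8* %obj, i8** %field) gc "shadow-stack" {
    entry:
      ; %obj is the start of the object, %field the address to read from.
      %ref = call i8* @llvm.gcread(i8* %obj, i8** %field)
      ret i8* %ref
    }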
12178
12179.. _int_gcwrite:
12180
12181'``llvm.gcwrite``' Intrinsic
12182^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12183
12184Syntax:
12185"""""""
12186
12187::
12188
12189      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)
12190
12191Overview:
12192"""""""""
12193
12194The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
12195locations, allowing garbage collector implementations that require write
12196barriers (such as generational or reference counting collectors).
12197
12198Arguments:
12199""""""""""
12200
12201The first argument is the reference to store, the second is the start of
12202the object to store it to, and the third is the address of the field of
12203Obj to store to. If the runtime does not require a pointer to the
12204object, Obj may be null.
12205
12206Semantics:
12207""""""""""
12208
12209The '``llvm.gcwrite``' intrinsic has the same semantics as a store
12210instruction, but may be replaced with substantially more complex code by
12211the garbage collector runtime, as needed. The '``llvm.gcwrite``'
12212intrinsic may only be used in a function which :ref:`specifies a GC
12213algorithm <gc>`.
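
As a hedged sketch, a reference store through the write barrier might look as
follows. The function name ``@store_field`` is hypothetical, and the
``"shadow-stack"`` strategy is assumed only so that the function specifies a
GC algorithm.

.. code-block:: llvm

    declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

    define void @store_field(i8* %val, i8* %obj, i8** %field) gc "shadow-stack" {
    entry:
      ; Store the reference %val into the field %field of object %obj.
      call void @llvm.gcwrite(i8* %val, i8* %obj, i8** %field)
      ret void
    }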
12214
12215
12216.. _gc_statepoint:
12217
12218'llvm.experimental.gc.statepoint' Intrinsic
12219^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12220
12221Syntax:
12222"""""""
12223
12224::
12225
12226      declare token
12227        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
12228                       func_type <target>,
12229                       i64 <#call args>, i64 <flags>,
12230                       ... (call parameters),
12231                       i64 0, i64 0)
12232
12233Overview:
12234"""""""""
12235
The statepoint intrinsic represents a call which is parseable by the
runtime.
12238
12239Operands:
12240"""""""""
12241
12242The 'id' operand is a constant integer that is reported as the ID
12243field in the generated stackmap.  LLVM does not interpret this
12244parameter in any way and its meaning is up to the statepoint user to
12245decide.  Note that LLVM is free to duplicate code containing
12246statepoint calls, and this may transform IR that had a unique 'id' per
12247lexical call to statepoint to IR that does not.
12248
12249If 'num patch bytes' is non-zero then the call instruction
12250corresponding to the statepoint is not emitted and LLVM emits 'num
12251patch bytes' bytes of nops in its place.  LLVM will emit code to
12252prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
12254sequence and the latter after the nop sequence.  It is expected that
12255the user will patch over the 'num patch bytes' bytes of nops with a
12256calling sequence specific to their runtime before executing the
12257generated machine code.  There are no guarantees with respect to the
alignment of the nop sequence.  Unlike :doc:`StackMaps`, statepoints do
12259not have a concept of shadow bytes.  Note that semantically the
12260statepoint still represents a call or invoke to 'target', and the nop
12261sequence after patching is expected to represent an operation
12262equivalent to a call or invoke to 'target'.
12263
12264The 'target' operand is the function actually being called.  The
12265target can be specified as either a symbolic LLVM function, or as an
12266arbitrary Value of appropriate function type.  Note that the function
12267type must match the signature of the callee and the types of the 'call
12268parameters' arguments.
12269
12270The '#call args' operand is the number of arguments to the actual
12271call.  It must exactly match the number of arguments passed in the
12272'call parameters' variable length section.
12273
12274The 'flags' operand is used to specify extra information about the
12275statepoint. This is currently only used to mark certain statepoints
12276as GC transitions. This operand is a 64-bit integer with the following
12277layout, where bit 0 is the least significant bit:
12278
12279  +-------+---------------------------------------------------+
12280  | Bit # | Usage                                             |
12281  +=======+===================================================+
12282  |     0 | Set if the statepoint is a GC transition, cleared |
12283  |       | otherwise.                                        |
12284  +-------+---------------------------------------------------+
12285  |  1-63 | Reserved for future use; must be cleared.         |
12286  +-------+---------------------------------------------------+
12287
12288The 'call parameters' arguments are simply the arguments which need to
12289be passed to the call target.  They will be lowered according to the
12290specified calling convention and otherwise handled like a normal call
12291instruction.  The number of arguments must exactly match what is
specified in '#call args'.  The types must match the signature of
12293'target'.
12294
The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced by the corresponding operand bundles.  In a future
revision, these now redundant arguments will be removed.
12300
12301Semantics:
12302""""""""""
12303
A statepoint is assumed to read and write all memory.  As a result,
memory operations cannot be reordered past a statepoint.  It is
illegal to mark a statepoint as being either 'readonly' or 'readnone'.

Note that legal IR cannot perform any memory operation on a 'gc
12309pointer' argument of the statepoint in a location statically reachable
12310from the statepoint.  Instead, the explicitly relocated value (from a
12311``gc.relocate``) must be used.
12312
12313'llvm.experimental.gc.result' Intrinsic
12314^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12315
12316Syntax:
12317"""""""
12318
12319::
12320
12321      declare type*
12322        @llvm.experimental.gc.result(token %statepoint_token)
12323
12324Overview:
12325"""""""""
12326
12327``gc.result`` extracts the result of the original call instruction
12328which was replaced by the ``gc.statepoint``.  The ``gc.result``
12329intrinsic is actually a family of three intrinsics due to an
12330implementation limitation.  Other than the type of the return value,
12331the semantics are the same.
12332
12333Operands:
12334"""""""""
12335
12336The first and only argument is the ``gc.statepoint`` which starts
12337the safepoint sequence of which this ``gc.result`` is a part.
12338Despite the typing of this as a generic token, *only* the value defined
12339by a ``gc.statepoint`` is legal here.
12340
12341Semantics:
12342""""""""""
12343
12344The ``gc.result`` represents the return value of the call target of
12345the ``statepoint``.  The type of the ``gc.result`` must exactly match
12346the type of the target.  If the call target returns void, there will
12347be no ``gc.result``.
12348
12349A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
12350side effects since it is just a projection of the return value of the
12351previous call represented by the ``gc.statepoint``.
12352
12353'llvm.experimental.gc.relocate' Intrinsic
12354^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12355
12356Syntax:
12357"""""""
12358
12359::
12360
12361      declare <pointer type>
12362        @llvm.experimental.gc.relocate(token %statepoint_token,
12363                                       i32 %base_offset,
12364                                       i32 %pointer_offset)
12365
12366Overview:
12367"""""""""
12368
12369A ``gc.relocate`` returns the potentially relocated value of a pointer
12370at the safepoint.
12371
12372Operands:
12373"""""""""
12374
12375The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
12377Despite the typing of this as a generic token, *only* the value defined
12378by a ``gc.statepoint`` is legal here.
12379
12380The second and third arguments are both indices into operands of the
12381corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
12382
12383The second argument is an index which specifies the allocation for the pointer
12384being relocated. The associated value must be within the object with which the
12385pointer being relocated is associated. The optimizer is free to change *which*
12386interior derived pointer is reported, provided that it does not replace an
12387actual base pointer with another interior derived pointer. Collectors are
12388allowed to rely on the base pointer operand remaining an actual base pointer if
12389so constructed.
12390
The third argument is an index which specifies the (potentially) derived pointer
12392being relocated.  It is legal for this index to be the same as the second
12393argument if-and-only-if a base pointer is being relocated.
12394
12395Semantics:
12396""""""""""
12397
12398The return value of ``gc.relocate`` is the potentially relocated value
12399of the pointer specified by its arguments.  It is unspecified how the
12400value of the returned pointer relates to the argument to the
12401``gc.statepoint`` other than that a) it points to the same source
12402language object with the same offset, and b) the 'based-on'
12403relationship of the newly relocated pointers is a projection of the
12404unrelocated pointers.  In particular, the integer value of the pointer
12405returned is unspecified.
12406
12407A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
12408side effects since it is just a way to extract information about work
12409done during the actual call modeled by the ``gc.statepoint``.
12410
12411.. _gc.get.pointer.base:
12412
12413'llvm.experimental.gc.get.pointer.base' Intrinsic
12414^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12415
12416Syntax:
12417"""""""
12418
12419::
12420
12421      declare <pointer type>
12422        @llvm.experimental.gc.get.pointer.base(
12423          <pointer type> readnone nocapture %derived_ptr)
12424          nounwind readnone willreturn
12425
12426Overview:
12427"""""""""
12428
12429``gc.get.pointer.base`` for a derived pointer returns its base pointer.
12430
12431Operands:
12432"""""""""
12433
12434The only argument is a pointer which is based on some object with
12435an unknown offset from the base of said object.
12436
12437Semantics:
12438""""""""""
12439
12440This intrinsic is used in the abstract machine model for GC to represent
12441the base pointer for an arbitrary derived pointer.
12442
This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer value of the derived
pointer. The replacement is done as part of the lowering to the
explicit statepoint model.
12447
12448The return pointer type must be the same as the type of the parameter.
12449
12450
12451'llvm.experimental.gc.get.pointer.offset' Intrinsic
12452^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12453
12454Syntax:
12455"""""""
12456
12457::
12458
12459      declare i64
12460        @llvm.experimental.gc.get.pointer.offset(
12461          <pointer type> readnone nocapture %derived_ptr)
12462          nounwind readnone willreturn
12463
12464Overview:
12465"""""""""
12466
12467``gc.get.pointer.offset`` for a derived pointer returns the offset from its
12468base pointer.
12469
12470Operands:
12471"""""""""
12472
12473The only argument is a pointer which is based on some object with
12474an unknown offset from the base of said object.
12475
12476Semantics:
12477""""""""""
12478
12479This intrinsic is used in the abstract machine model for GC to represent
12480the offset of an arbitrary derived pointer from its base pointer.
12481
12482This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
12483replacing all uses of this callsite with the offset of a derived pointer from
12484its base pointer value. The replacement is done as part of the lowering to the
12485explicit statepoint model.
12486
In effect, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`) after casting both with
``ptrtoint``. Performing these casts outside the :ref:`RewriteStatepointsForGC`
pass, however, could cause the pointers to be lost to the later lowering from
the abstract model to the explicit physical one.
12492
12493Code Generator Intrinsics
12494-------------------------
12495
12496These intrinsics are provided by LLVM to expose special features that
12497may only be implemented with code generator support.
12498
12499'``llvm.returnaddress``' Intrinsic
12500^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12501
12502Syntax:
12503"""""""
12504
12505::
12506
12507      declare i8* @llvm.returnaddress(i32 <level>)
12508
12509Overview:
12510"""""""""
12511
12512The '``llvm.returnaddress``' intrinsic attempts to compute a
12513target-specific value indicating the return address of the current
12514function or one of its callers.
12515
12516Arguments:
12517""""""""""
12518
12519The argument to this intrinsic indicates which function to return the
12520address for. Zero indicates the calling function, one indicates its
12521caller, etc. The argument is **required** to be a constant integer
12522value.
12523
12524Semantics:
12525""""""""""
12526
12527The '``llvm.returnaddress``' intrinsic either returns a pointer
12528indicating the return address of the specified call frame, or zero if it
12529cannot be identified. The value returned by this intrinsic is likely to
12530be incorrect or 0 for arguments other than zero, so it should only be
12531used for debugging purposes.
12532
12533Note that calling this intrinsic does not prevent function inlining or
12534other aggressive transformations, so the value returned may not be that
12535of the obvious source-language caller.
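
A minimal sketch of a call site follows, using the declaration from the Syntax
section above. The wrapper name ``@current_return_address`` is hypothetical.
The level is the constant ``0``, which is the only level for which the result
is likely to be meaningful.

.. code-block:: llvm

    declare i8* @llvm.returnaddress(i32)

    define i8* @current_return_address() {
    entry:
      ; The argument must be a constant; 0 selects the calling function.
      %ra = call i8* @llvm.returnaddress(i32 0)
      ret i8* %ra
    }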
12536
12537'``llvm.addressofreturnaddress``' Intrinsic
12538^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12539
12540Syntax:
12541"""""""
12542
12543::
12544
12545      declare i8* @llvm.addressofreturnaddress()
12546
12547Overview:
12548"""""""""
12549
12550The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
12551pointer to the place in the stack frame where the return address of the
12552current function is stored.
12553
12554Semantics:
12555""""""""""
12556
12557Note that calling this intrinsic does not prevent function inlining or
12558other aggressive transformations, so the value returned may not be that
12559of the obvious source-language caller.
12560
12561This intrinsic is only implemented for x86 and aarch64.
12562
12563'``llvm.sponentry``' Intrinsic
12564^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12565
12566Syntax:
12567"""""""
12568
12569::
12570
12571      declare i8* @llvm.sponentry()
12572
12573Overview:
12574"""""""""
12575
12576The '``llvm.sponentry``' intrinsic returns the stack pointer value at
12577the entry of the current function calling this intrinsic.
12578
12579Semantics:
12580""""""""""
12581
Note that this intrinsic has only been verified on AArch64.
12583
12584'``llvm.frameaddress``' Intrinsic
12585^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12586
12587Syntax:
12588"""""""
12589
12590::
12591
12592      declare i8* @llvm.frameaddress(i32 <level>)
12593
12594Overview:
12595"""""""""
12596
12597The '``llvm.frameaddress``' intrinsic attempts to return the
12598target-specific frame pointer value for the specified stack frame.
12599
12600Arguments:
12601""""""""""
12602
12603The argument to this intrinsic indicates which function to return the
12604frame pointer for. Zero indicates the calling function, one indicates
12605its caller, etc. The argument is **required** to be a constant integer
12606value.
12607
12608Semantics:
12609""""""""""
12610
12611The '``llvm.frameaddress``' intrinsic either returns a pointer
12612indicating the frame address of the specified call frame, or zero if it
12613cannot be identified. The value returned by this intrinsic is likely to
12614be incorrect or 0 for arguments other than zero, so it should only be
12615used for debugging purposes.
12616
12617Note that calling this intrinsic does not prevent function inlining or
12618other aggressive transformations, so the value returned may not be that
12619of the obvious source-language caller.
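
The following sketch, which uses the declaration shown above, reads the frame
address of the current function (level zero), the only level for which the
result is likely to be reliable. The function name ``@current_frame`` is
hypothetical.

.. code-block:: llvm

      declare i8* @llvm.frameaddress(i32)

      ; Level 0 names the function containing the call; the argument must be
      ; a constant integer.
      define i8* @current_frame() {
      entry:
        %fp = call i8* @llvm.frameaddress(i32 0)
        ret i8* %fp
      }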
12620
12621'``llvm.swift.async.context.addr``' Intrinsic
12622^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12623
12624Syntax:
12625"""""""
12626
12627::
12628
12629      declare i8** @llvm.swift.async.context.addr()
12630
12631Overview:
12632"""""""""
12633
12634The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
12635the part of the extended frame record containing the asynchronous
12636context of a Swift execution.
12637
12638Semantics:
12639""""""""""
12640
12641If the caller has a ``swiftasync`` parameter, that argument will initially
12642be stored at the returned address. If not, it will be initialized to null.
12643
12644'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
12645^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12646
12647Syntax:
12648"""""""
12649
12650::
12651
12652      declare void @llvm.localescape(...)
12653      declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)
12654
12655Overview:
12656"""""""""
12657
12658The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
12659allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
12660live frame pointer to recover the address of the allocation. The offset is
12661computed during frame layout of the caller of ``llvm.localescape``.
12662
12663Arguments:
12664""""""""""
12665
12666All arguments to '``llvm.localescape``' must be pointers to static allocas or
12667casts of static allocas. Each function can only call '``llvm.localescape``'
12668once, and it can only do so from the entry block.
12669
12670The ``func`` argument to '``llvm.localrecover``' must be a constant
12671bitcasted pointer to a function defined in the current module. The code
12672generator cannot determine the frame allocation offset of functions defined in
12673other modules.
12674
12675The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
12676call frame that is currently live. The return value of '``llvm.localaddress``'
12677is one way to produce such a value, but various runtimes also expose a suitable
12678pointer in platform-specific ways.
12679
12680The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
12681'``llvm.localescape``' to recover. It is zero-indexed.
12682
12683Semantics:
12684""""""""""
12685
These intrinsics allow a group of functions to share access to a set of local
stack allocations of a single parent function. The parent function may call the
12688'``llvm.localescape``' intrinsic once from the function entry block, and the
12689child functions can use '``llvm.localrecover``' to access the escaped allocas.
12690The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
12691the escaped allocas are allocated, which would break attempts to use
12692'``llvm.localrecover``'.
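
A minimal sketch of the parent/child pattern described above is shown below.
The function names ``@parent`` and ``@child`` are hypothetical, and the frame
pointer is obtained with ``llvm.localaddress``, which is only one of the
possible sources mentioned earlier.

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8*, i8*, i32)
      declare i8* @llvm.localaddress()

      define void @parent() {
      entry:
        %a = alloca i32
        %b = alloca i32
        ; Escape both static allocas; %a has index 0 and %b has index 1.
        call void (...) @llvm.localescape(i32* %a, i32* %b)
        %fp = call i8* @llvm.localaddress()
        ; The parent frame is still live while the child runs.
        call void @child(i8* %fp)
        ret void
      }

      define void @child(i8* %fp) {
      entry:
        ; Recover the address of %b (index 1) in the frame of @parent.
        %b.raw = call i8* @llvm.localrecover(
                     i8* bitcast (void ()* @parent to i8*), i8* %fp, i32 1)
        %b = bitcast i8* %b.raw to i32*
        store i32 42, i32* %b
        ret void
      }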
12693
12694'``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
12695^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12696
12697Syntax:
12698"""""""
12699
12700::
12701
12702      declare void @llvm.seh.try.begin()
12703      declare void @llvm.seh.try.end()
12704
12705Overview:
12706"""""""""
12707
The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a ``__try`` region for Windows SEH Asynchronous Exception Handling.
12710
12711Semantics:
12712""""""""""
12713
When a C function is compiled with the Windows SEH Asynchronous Exception option,
``-feh_asynch`` (aka MSVC ``-EHa``), these two intrinsics are injected to mark the
``__try`` boundary and to prevent potential exceptions from being moved across it.
12717Any set of operations can then be confined to the region by reading their leaf
12718inputs via volatile loads and writing their root outputs via volatile stores.
12719
12720'``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
12721^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12722
12723Syntax:
12724"""""""
12725
12726::
12727
12728      declare void @llvm.seh.scope.begin()
12729      declare void @llvm.seh.scope.end()
12730
12731Overview:
12732"""""""""
12733
The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
Handling (MSVC option ``-EHa``).
12737
12738Semantics:
12739""""""""""
12740
LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke`` instructions, which normally correspond only to call sites.  To
12743support arbitrary faulting instructions, it must be possible to recover the current
12744EH scope for any instruction.  Turning every operation in LLVM that could fault
12745into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
12746large number of intrinsics, impede optimization of those operations, and make
12747compilation slower by introducing many extra basic blocks.  These intrinsics can
12748be used instead to mark the region protected by a cleanup, such as for a local
12749C++ object with a non-trivial destructor.  ``llvm.seh.scope.begin`` is used to mark
12750the start of the region; it is always called with ``invoke``, with the unwind block
12751being the desired unwind destination for any potentially-throwing instructions
within the region.  ``llvm.seh.scope.end`` is used to mark when the scope ends
12753and the EH cleanup is no longer required (e.g. because the destructor is being
12754called).
12755
12756.. _int_read_register:
12757.. _int_read_volatile_register:
12758.. _int_write_register:
12759
12760'``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
12761^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12762
12763Syntax:
12764"""""""
12765
12766::
12767
12768      declare i32 @llvm.read_register.i32(metadata)
12769      declare i64 @llvm.read_register.i64(metadata)
12770      declare i32 @llvm.read_volatile_register.i32(metadata)
12771      declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
12774      !0 = !{!"sp\00"}
12775
12776Overview:
12777"""""""""
12778
12779The '``llvm.read_register``', '``llvm.read_volatile_register``', and
12780'``llvm.write_register``' intrinsics provide access to the named register.
12781The register must be valid on the architecture being compiled to. The type
12782needs to be compatible with the register being read.
12783
12784Semantics:
12785""""""""""
12786
12787The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
12788return the current value of the register, where possible. The
12789'``llvm.write_register``' intrinsic sets the current value of the register,
12790where possible.
12791
12792A call to '``llvm.read_volatile_register``' is assumed to have side-effects
12793and possibly return a different value each time (e.g. for a timer register).
12794
12795This is useful to implement named register global variables that need
12796to always be mapped to a specific register, as is common practice on
12797bare-metal programs including OS kernels.
12798
12799The compiler doesn't check for register availability or use of the used
12800register in surrounding code, including inline assembly. Because of that,
12801allocatable registers are not supported.
12802
Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and, even more so, allocatable
registers.
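
As a sketch, the function below reads the stack pointer through the named
register metadata shown in the syntax section. The register name ``sp`` is an
assumption that fits targets such as AArch64; other targets may spell the
stack pointer differently. The function name ``@get_stack_pointer`` is
hypothetical.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      ; Returns the current value of the register named by !0.
      define i64 @get_stack_pointer() {
      entry:
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"sp\00"}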
12807
12808.. _int_stacksave:
12809
12810'``llvm.stacksave``' Intrinsic
12811^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12812
12813Syntax:
12814"""""""
12815
12816::
12817
12818      declare i8* @llvm.stacksave()
12819
12820Overview:
12821"""""""""
12822
12823The '``llvm.stacksave``' intrinsic is used to remember the current state
12824of the function stack, for use with
12825:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
12826implementing language features like scoped automatic variable sized
12827arrays in C99.
12828
12829Semantics:
12830""""""""""
12831
12832This intrinsic returns an opaque pointer value that can be passed to
12833:ref:`llvm.stackrestore <int_stackrestore>`. When an
12834``llvm.stackrestore`` intrinsic is executed with a value saved from
12835``llvm.stacksave``, it effectively restores the state of the stack to
12836the state it was in when the ``llvm.stacksave`` intrinsic executed. In
12837practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
12838were allocated after the ``llvm.stacksave`` was executed.
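
The pairing can be sketched as follows for a scoped variable-sized array. The
names ``@scoped_vla`` and ``@use`` are hypothetical; ``@use`` stands in for
whatever code consumes the array.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)
      declare void @use(i32*)

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state at the start of the scope.
        %saved = call i8* @llvm.stacksave()
        ; Dynamic alloca of %n elements, allocated after the save.
        %vla = alloca i32, i32 %n
        call void @use(i32* %vla)
        ; Leaving the scope: pop everything allocated since the save.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }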
12839
12840.. _int_stackrestore:
12841
12842'``llvm.stackrestore``' Intrinsic
12843^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12844
12845Syntax:
12846"""""""
12847
12848::
12849
12850      declare void @llvm.stackrestore(i8* %ptr)
12851
12852Overview:
12853"""""""""
12854
12855The '``llvm.stackrestore``' intrinsic is used to restore the state of
12856the function stack to the state it was in when the corresponding
12857:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
12858useful for implementing language features like scoped automatic variable
12859sized arrays in C99.
12860
12861Semantics:
12862""""""""""
12863
12864See the description for :ref:`llvm.stacksave <int_stacksave>`.
12865
12866.. _int_get_dynamic_area_offset:
12867
12868'``llvm.get.dynamic.area.offset``' Intrinsic
12869^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12870
12871Syntax:
12872"""""""
12873
12874::
12875
12876      declare i32 @llvm.get.dynamic.area.offset.i32()
12877      declare i64 @llvm.get.dynamic.area.offset.i64()
12878
12879Overview:
12880"""""""""
12881
The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.
12889
12890Semantics:
12891""""""""""
12892
These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer yields the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer yields the address one
past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` returns
just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` must match
the width of the target's default address space (address space 0) pointer
type.
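
The combination with ``llvm.stacksave`` can be sketched as below. This is an
illustration only: it assumes a 64-bit target whose stack grows downwards and
that the value returned by ``llvm.stacksave`` can be used as the native stack
pointer. The function name ``@most_recent_dynamic_alloca`` is hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      define i8* @most_recent_dynamic_alloca() {
      entry:
        %sp = call i8* @llvm.stacksave()
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Downward-growing stack: stack pointer plus offset.
        %addr = getelementptr i8, i8* %sp, i64 %off
        ret i8* %addr
      }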
12907
12908'``llvm.prefetch``' Intrinsic
12909^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12910
12911Syntax:
12912"""""""
12913
12914::
12915
12916      declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
12917
12918Overview:
12919"""""""""
12920
The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a no-op.
12923Prefetches have no effect on the behavior of the program but can change
12924its performance characteristics.
12925
12926Arguments:
12927""""""""""
12928
12929``address`` is the address to be prefetched, ``rw`` is the specifier
12930determining if the fetch should be for a read (0) or write (1), and
12931``locality`` is a temporal locality specifier ranging from (0) - no
12932locality, to (3) - extremely local keep in cache. The ``cache type``
12933specifies whether the prefetch is performed on the data (1) or
12934instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
12935arguments must be constant integers.
12936
12937Semantics:
12938""""""""""
12939
12940This intrinsic does not modify the behavior of the program. In
12941particular, prefetches cannot trap and do not produce a value. On
12942targets that support this intrinsic, the prefetch can provide hints to
12943the processor cache for better performance.
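
A short sketch using the declaration shown above: the call requests a
read prefetch into the data cache with the highest temporal locality. The
function name ``@warm_cache`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.prefetch(i8*, i32, i32, i32)

      define void @warm_cache(i8* %p) {
      entry:
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
        call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
        ret void
      }

Removing the call does not change what the program computes, only how fast
it may run.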
12944
12945'``llvm.pcmarker``' Intrinsic
12946^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12947
12948Syntax:
12949"""""""
12950
12951::
12952
12953      declare void @llvm.pcmarker(i32 <id>)
12954
12955Overview:
12956"""""""""
12957
12958The '``llvm.pcmarker``' intrinsic is a method to export a Program
12959Counter (PC) in a region of code to simulators and other tools. The
12960method is target specific, but it is expected that the marker will use
12961exported symbols to transmit the PC of the marker. The marker makes no
12962guarantees that it will remain with any specific instruction after
12963optimizations. It is possible that the presence of a marker will inhibit
12964optimizations. The intended use is to be inserted after optimizations to
12965allow correlations of simulation runs.
12966
12967Arguments:
12968""""""""""
12969
12970``id`` is a numerical id identifying the marker.
12971
12972Semantics:
12973""""""""""
12974
12975This intrinsic does not modify the behavior of the program. Backends
12976that do not support this intrinsic may ignore it.
12977
12978'``llvm.readcyclecounter``' Intrinsic
12979^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12980
12981Syntax:
12982"""""""
12983
12984::
12985
12986      declare i64 @llvm.readcyclecounter()
12987
12988Overview:
12989"""""""""
12990
12991The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
12992counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.
12997
12998Semantics:
12999""""""""""
13000
13001When directly supported, reading the cycle counter should not modify any
13002memory. Implementations are allowed to either return an application
13003specific value or a system wide value. On backends without support, this
13004is lowered to a constant 0.
13005
Note that runtime support may be conditional on the privilege level the code
is running at and on the host platform.
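
A small timing can be sketched by reading the counter before and after the
code of interest. The names ``@time_work`` and ``@work`` are hypothetical;
``@work`` stands in for the region being measured.

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()

      define i64 @time_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed ticks; this is 0 on backends that lower the intrinsic to
        ; a constant 0.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }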
13008
13009'``llvm.clear_cache``' Intrinsic
13010^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13011
13012Syntax:
13013"""""""
13014
13015::
13016
13017      declare void @llvm.clear_cache(i8*, i8*)
13018
13019Overview:
13020"""""""""
13021
13022The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
13023in the specified range to the execution unit of the processor. On
13024targets with non-unified instruction and data cache, the implementation
13025flushes the instruction cache.
13026
13027Semantics:
13028""""""""""
13029
13030On platforms with coherent instruction and data caches (e.g. x86), this
13031intrinsic is a nop. On platforms with non-coherent instruction and data
13032cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
13033instructions or a system call, if cache flushing requires special
13034privileges.
13035
13036The default behavior is to emit a call to ``__clear_cache`` from the run
13037time library.
13038
13039This intrinsic does *not* empty the instruction pipeline. Modifications
13040of the current function are outside the scope of the intrinsic.
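
As a sketch, code that has just written instructions into a buffer could
make them visible as follows. The example assumes that the two pointer
arguments are the start and the end of the modified range, mirroring
``__clear_cache``; the function name ``@publish_code`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.clear_cache(i8*, i8*)

      define void @publish_code(i8* %begin, i8* %end) {
      entry:
        ; Assumed argument order: start of the range, then its end.
        call void @llvm.clear_cache(i8* %begin, i8* %end)
        ret void
      }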
13041
13042'``llvm.instrprof.increment``' Intrinsic
13043^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13044
13045Syntax:
13046"""""""
13047
13048::
13049
13050      declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
13051                                             i32 <num-counters>, i32 <index>)
13052
13053Overview:
13054"""""""""
13055
13056The '``llvm.instrprof.increment``' intrinsic can be emitted by a
13057frontend for use with instrumentation based profiling. These will be
13058lowered by the ``-instrprof`` pass to generate execution counts of a
13059program at runtime.
13060
13061Arguments:
13062""""""""""
13063
13064The first argument is a pointer to a global variable containing the
13065name of the entity being instrumented. This should generally be the
13066(mangled) function name for a set of counters.
13067
13068The second argument is a hash value that can be used by the consumer
13069of the profile data to detect changes to the instrumented source, and
13070the third is the number of counters associated with ``name``. It is an
13071error if ``hash`` or ``num-counters`` differ between two instances of
13072``instrprof.increment`` that refer to the same name.
13073
The last argument refers to which of the counters for ``name`` should
be incremented. It should be at least 0 and less than ``num-counters``.
13076
13077Semantics:
13078""""""""""
13079
13080This intrinsic represents an increment of a profiling counter. It will
13081cause the ``-instrprof`` pass to generate the appropriate data
13082structures and the code to increment the appropriate value, in a
13083format that can be written out by a compiler runtime and consumed via
13084the ``llvm-profdata`` tool.
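
A frontend-emitted counter increment might look like the sketch below. The
global ``@__profn_foo``, the function ``@foo``, and the hash value ``0`` are
hypothetical values chosen for illustration; the function has a single
counter, so ``num-counters`` is 1 and ``index`` is 0.

.. code-block:: llvm

      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo() {
      entry:
        ; name = "foo", hash = 0, num-counters = 1, index = 0
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 0, i32 1, i32 0)
        ret void
      }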
13085
13086'``llvm.instrprof.increment.step``' Intrinsic
13087^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13088
13089Syntax:
13090"""""""
13091
13092::
13093
13094      declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
13095                                                  i32 <num-counters>,
13096                                                  i32 <index>, i64 <step>)
13097
13098Overview:
13099"""""""""
13100
13101The '``llvm.instrprof.increment.step``' intrinsic is an extension to
13102the '``llvm.instrprof.increment``' intrinsic with an additional fifth
13103argument to specify the step of the increment.
13104
Arguments:
""""""""""

The first four arguments are the same as for the
'``llvm.instrprof.increment``' intrinsic.
13109
13110The last argument specifies the value of the increment of the counter variable.
13111
Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.
13115
13116
13117'``llvm.instrprof.value.profile``' Intrinsic
13118^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13119
13120Syntax:
13121"""""""
13122
13123::
13124
13125      declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
13126                                                 i64 <value>, i32 <value_kind>,
13127                                                 i32 <index>)
13128
13129Overview:
13130"""""""""
13131
The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.
13136
13137Arguments:
13138""""""""""
13139
13140The first argument is a pointer to a global variable containing the
13141name of the entity being instrumented. ``name`` should generally be the
13142(mangled) function name for a set of counters.
13143
13144The second argument is a hash value that can be used by the consumer
13145of the profile data to detect changes to the instrumented source. It
13146is an error if ``hash`` differs between two instances of
13147``llvm.instrprof.*`` that refer to the same name.
13148
13149The third argument is the value of the expression being profiled. The profiled
13150expression's value should be representable as an unsigned 64-bit value. The
13151fourth argument represents the kind of value profiling that is being done. The
13152supported value profiling kinds are enumerated through the
13153``InstrProfValueKind`` type declared in the
13154``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
13155index of the instrumented expression within ``name``. It should be >= 0.
13156
13157Semantics:
13158""""""""""
13159
This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and replace
the ``llvm.instrprof.value.profile`` intrinsic with a call to the profile
runtime library with the proper arguments.
13165
13166'``llvm.thread.pointer``' Intrinsic
13167^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13168
13169Syntax:
13170"""""""
13171
13172::
13173
13174      declare i8* @llvm.thread.pointer()
13175
13176Overview:
13177"""""""""
13178
13179The '``llvm.thread.pointer``' intrinsic returns the value of the thread
13180pointer.
13181
13182Semantics:
13183""""""""""
13184
13185The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
13186for the current thread.  The exact semantics of this value are target
13187specific: it may point to the start of TLS area, to the end, or somewhere
13188in the middle.  Depending on the target, this intrinsic may read a register,
13189call a helper function, read from an alternate memory space, or perform
13190other operations necessary to locate the TLS area.  Not all targets support
13191this intrinsic.
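
A minimal sketch of a call is shown below; the function name
``@get_thread_pointer`` is hypothetical. How the returned pointer relates to
the TLS area remains target specific, as described above.

.. code-block:: llvm

      declare i8* @llvm.thread.pointer()

      define i8* @get_thread_pointer() {
      entry:
        %tp = call i8* @llvm.thread.pointer()
        ret i8* %tp
      }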
13192
13193'``llvm.call.preallocated.setup``' Intrinsic
13194^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13195
13196Syntax:
13197"""""""
13198
13199::
13200
13201      declare token @llvm.call.preallocated.setup(i32 %num_args)
13202
13203Overview:
13204"""""""""
13205
13206The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
13207be used with a call's ``"preallocated"`` operand bundle to indicate that
13208certain arguments are allocated and initialized before the call.
13209
13210Semantics:
13211""""""""""
13212
The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
``"preallocated"`` operand bundle for the corresponding call.
13218
Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]
13237
13238.. _int_call_preallocated_arg:
13239
13240'``llvm.call.preallocated.arg``' Intrinsic
13241^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13242
13243Syntax:
13244"""""""
13245
13246::
13247
13248      declare i8* @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
13249
13250Overview:
13251"""""""""
13252
13253The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
13254corresponding preallocated argument for the preallocated call.
13255
13256Semantics:
13257""""""""""
13258
The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
argument at index ``%arg_index`` among the arguments with the
``preallocated`` attribute for the call associated with the
``%setup_token``, which must be from '``llvm.call.preallocated.setup``'.
13263
13264A call to '``llvm.call.preallocated.arg``' must have a call site
13265``preallocated`` attribute. The type of the ``preallocated`` attribute must
13266match the type used by the ``preallocated`` attribute of the corresponding
13267argument at the preallocated call. The type is used in the case that an
13268``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
13269to DCE), where otherwise we cannot know how large the arguments are.
13270
13271It is undefined behavior if this is called with a token from an
13272'``llvm.call.preallocated.setup``' if another
13273'``llvm.call.preallocated.setup``' has already been called or if the
13274preallocated call corresponding to the '``llvm.call.preallocated.setup``'
13275has already been called.
13276
13277.. _int_call_preallocated_teardown:
13278
13279'``llvm.call.preallocated.teardown``' Intrinsic
13280^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13281
13282Syntax:
13283"""""""
13284
13285::
13286
      declare void @llvm.call.preallocated.teardown(token %setup_token)
13288
13289Overview:
13290"""""""""
13291
13292The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13293created by a '``llvm.call.preallocated.setup``'.
13294
13295Semantics:
13296""""""""""
13297
The token argument must be a token returned by
'``llvm.call.preallocated.setup``'.
13299
13300The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13301allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
13302one of this or the preallocated call must be called to prevent stack leaks.
13303It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
13304and the preallocated call for a given '``llvm.call.preallocated.setup``'.
13305
13306For example, if the stack is allocated for a preallocated call by a
13307'``llvm.call.preallocated.setup``', then an initializer function called on an
13308allocated argument throws an exception, there should be a
13309'``llvm.call.preallocated.teardown``' in the exception handler to prevent
13310stack leaks.
13311
13312Following the nesting rules in '``llvm.call.preallocated.setup``', nested
13313calls to '``llvm.call.preallocated.setup``' and
13314'``llvm.call.preallocated.teardown``' are allowed but must be properly
13315nested.
13316
13317Example:
13318""""""""
13319
13320.. code-block:: llvm
13321
13322        %cs = call token @llvm.call.preallocated.setup(i32 1)
13323        %x = call i8* @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
13324        %y = bitcast i8* %x to i32*
13325        invoke void @constructor(i32* %y) to label %conta unwind label %contb
13326    conta:
13327        call void @foo1(i32* preallocated(i32) %y) ["preallocated"(token %cs)]
13328        ret void
13329    contb:
13330        %s = catchswitch within none [label %catch] unwind to caller
13331    catch:
13332        %p = catchpad within %s []
13333        call void @llvm.call.preallocated.teardown(token %cs)
13334        ret void
13335
13336Standard C/C++ Library Intrinsics
13337---------------------------------
13338
13339LLVM provides intrinsics for a few important standard C/C++ library
13340functions. These intrinsics allow source-language front-ends to pass
13341information about the alignment of the pointer arguments to the code
13342generator, providing opportunity for more efficient code generation.
13343
13344
13345'``llvm.abs.*``' Intrinsic
13346^^^^^^^^^^^^^^^^^^^^^^^^^^
13347
13348Syntax:
13349"""""""
13350
13351This is an overloaded intrinsic. You can use ``llvm.abs`` on any
13352integer bit width or any vector of integer elements.
13353
13354::
13355
13356      declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
13357      declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
13358
13359Overview:
13360"""""""""
13361
13362The '``llvm.abs``' family of intrinsic functions returns the absolute value
13363of an argument.
13364
13365Arguments:
13366""""""""""
13367
13368The first argument is the value for which the absolute value is to be returned.
13369This argument may be of any integer type or a vector with integer element type.
13370The return type must match the first argument type.
13371
13372The second argument must be a constant and is a flag to indicate whether the
13373result value of the '``llvm.abs``' intrinsic is a
13374:ref:`poison value <poisonvalues>` if the argument is statically or dynamically
13375an ``INT_MIN`` value.
13376
13377Semantics:
13378""""""""""
13379
The '``llvm.abs``' intrinsic returns the magnitude (a non-negative value,
except in the ``INT_MIN`` case below) of the argument or of each element of a
vector argument. If the argument is ``INT_MIN``, then the result is also
``INT_MIN`` if ``is_int_min_poison == 0`` and ``poison`` otherwise.
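
The effect of the flag can be sketched with two wrappers that differ only in
the second argument. The function names ``@abs_wrapping`` and
``@abs_poison`` are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      ; is_int_min_poison = false: INT_MIN maps to INT_MIN.
      define i32 @abs_wrapping(i32 %x) {
      entry:
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      ; is_int_min_poison = true: the result is poison when %x is INT_MIN.
      define i32 @abs_poison(i32 %x) {
      entry:
        %r = call i32 @llvm.abs.i32(i32 %x, i1 true)
        ret i32 %r
      }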
13384
13385
13386'``llvm.smax.*``' Intrinsic
13387^^^^^^^^^^^^^^^^^^^^^^^^^^^
13388
13389Syntax:
13390"""""""
13391
13392This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
13393integer bit width or any vector of integer elements.
13394
13395::
13396
13397      declare i32 @llvm.smax.i32(i32 %a, i32 %b)
13398      declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
13399
13400Overview:
13401"""""""""
13402
13403Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
13404Vector intrinsics operate on a per-element basis. The larger element of ``%a``
13405and ``%b`` at a given index is returned for that index.
13406
13407Arguments:
13408""""""""""
13409
13410The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13411integer element type. The argument types must match each other, and the return
13412type must match the argument type.
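
As a sketch, the scalar and vector forms can both be used to clamp negative
values to zero. The function names ``@clamp_to_zero`` and ``@clamp4`` are
hypothetical.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)

      ; Returns %x if it is positive as a signed integer, otherwise 0.
      define i32 @clamp_to_zero(i32 %x) {
      entry:
        %r = call i32 @llvm.smax.i32(i32 %x, i32 0)
        ret i32 %r
      }

      ; The same operation applied independently to each of four lanes.
      define <4 x i32> @clamp4(<4 x i32> %v) {
      entry:
        %r = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %v,
                                             <4 x i32> zeroinitializer)
        ret <4 x i32> %r
      }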
13413
13414
13415'``llvm.smin.*``' Intrinsic
13416^^^^^^^^^^^^^^^^^^^^^^^^^^^
13417
13418Syntax:
13419"""""""
13420
13421This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
13422integer bit width or any vector of integer elements.
13423
13424::
13425
13426      declare i32 @llvm.smin.i32(i32 %a, i32 %b)
13427      declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
13428
13429Overview:
13430"""""""""
13431
13432Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
13433Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
13434and ``%b`` at a given index is returned for that index.
13435
13436Arguments:
13437""""""""""
13438
13439The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13440integer element type. The argument types must match each other, and the return
13441type must match the argument type.
13442
13443
13444'``llvm.umax.*``' Intrinsic
13445^^^^^^^^^^^^^^^^^^^^^^^^^^^
13446
13447Syntax:
13448"""""""
13449
13450This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
13451integer bit width or any vector of integer elements.
13452
13453::
13454
13455      declare i32 @llvm.umax.i32(i32 %a, i32 %b)
13456      declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
13457
13458Overview:
13459"""""""""
13460
13461Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
13462integers. Vector intrinsics operate on a per-element basis. The larger element
13463of ``%a`` and ``%b`` at a given index is returned for that index.
13464
13465Arguments:
13466""""""""""
13467
13468The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13469integer element type. The argument types must match each other, and the return
13470type must match the argument type.
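
The difference between the signed and unsigned variants only shows up when an
operand has its top bit set. The sketch below passes the same constants to
``@llvm.smax`` and ``@llvm.umax``; the function names are hypothetical:

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare i32 @llvm.umax.i32(i32, i32)

      ; Signed comparison: -1 is less than 1, so the result is 1.
      define i32 @signed_case() {
        %s = call i32 @llvm.smax.i32(i32 -1, i32 1)
        ret i32 %s
      }

      ; Unsigned comparison: -1 is the bit pattern 0xFFFFFFFF, the largest
      ; i32 value, so the result is -1.
      define i32 @unsigned_case() {
        %u = call i32 @llvm.umax.i32(i32 -1, i32 1)
        ret i32 %u
      }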
13471
13472
13473'``llvm.umin.*``' Intrinsic
13474^^^^^^^^^^^^^^^^^^^^^^^^^^^
13475
13476Syntax:
13477"""""""
13478
13479This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
13480integer bit width or any vector of integer elements.
13481
13482::
13483
13484      declare i32 @llvm.umin.i32(i32 %a, i32 %b)
13485      declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
13486
13487Overview:
13488"""""""""
13489
13490Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
13491integers. Vector intrinsics operate on a per-element basis. The smaller element
13492of ``%a`` and ``%b`` at a given index is returned for that index.
13493
13494Arguments:
13495""""""""""
13496
13497The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13498integer element type. The argument types must match each other, and the return
13499type must match the argument type.
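
A short sketch of the vector form, assuming a hypothetical helper
``@cap_lanes`` that caps every lane at 255. Each lane is compared
independently of the others:

.. code-block:: llvm

      declare <4 x i32> @llvm.umin.v4i32(<4 x i32>, <4 x i32>)

      ; Hypothetical helper: per-lane unsigned minimum against the constant 255.
      define <4 x i32> @cap_lanes(<4 x i32> %v) {
        %r = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %v, <4 x i32> <i32 255, i32 255, i32 255, i32 255>)
        ret <4 x i32> %r
      }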
13500
13501
13502.. _int_memcpy:
13503
13504'``llvm.memcpy``' Intrinsic
13505^^^^^^^^^^^^^^^^^^^^^^^^^^^
13506
13507Syntax:
13508"""""""
13509
13510This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
13511integer bit width and for different address spaces. Not all targets
13512support all bit widths however.
13513
13514::
13515
13516      declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13517                                              i32 <len>, i1 <isvolatile>)
13518      declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13519                                              i64 <len>, i1 <isvolatile>)
13520
13521Overview:
13522"""""""""
13523
13524The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
13525source location to the destination location.
13526
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13530
13531Arguments:
13532""""""""""
13533
13534The first argument is a pointer to the destination, the second is a
13535pointer to the source. The third argument is an integer argument
13536specifying the number of bytes to copy, and the fourth is a
13537boolean indicating a volatile access.
13538
13539The :ref:`align <attr_align>` parameter attribute can be provided
13540for the first and second arguments.
13541
13542If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
13543a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13544very cleanly specified and it is unwise to depend on it.
13545
13546Semantics:
13547""""""""""
13548
13549The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
13550location to the destination location, which must either be equal or
13551non-overlapping. It copies "len" bytes of memory over. If the argument is known
13552to be aligned to some boundary, this can be specified as an attribute on the
13553argument.
13554
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached
to the arguments.
13557If ``<len>`` is not a well-defined value, the behavior is undefined.
13558If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13559otherwise the behavior is undefined.
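
As an illustration, a non-volatile 16-byte copy with alignment given as
call-site parameter attributes might look like the sketch below. The function
name ``@copy16`` and the 4-byte alignment are assumptions made for the example,
and the caller is assumed to pass pointers that are equal or non-overlapping:

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      ; Hypothetical helper: copies 16 bytes from %src to %dst.
      define void @copy16(i8* %dst, i8* %src) {
        ; 'align 4' asserts the alignment of each pointer; 'i1 false' means
        ; the copy is not a volatile operation.
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src, i64 16, i1 false)
        ret void
      }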
13560
13561.. _int_memcpy_inline:
13562
13563'``llvm.memcpy.inline``' Intrinsic
13564^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13565
13566Syntax:
13567"""""""
13568
13569This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
13570integer bit width and for different address spaces. Not all targets
13571support all bit widths however.
13572
13573::
13574
13575      declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13576                                                     i32 <len>, i1 <isvolatile>)
13577      declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13578                                                     i64 <len>, i1 <isvolatile>)
13579
13580Overview:
13581"""""""""
13582
The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13590
13591Arguments:
13592""""""""""
13593
13594The first argument is a pointer to the destination, the second is a
13595pointer to the source. The third argument is a constant integer argument
13596specifying the number of bytes to copy, and the fourth is a
13597boolean indicating a volatile access.
13598
13599The :ref:`align <attr_align>` parameter attribute can be provided
13600for the first and second arguments.
13601
13602If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
13603a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13604very cleanly specified and it is unwise to depend on it.
13605
13606Semantics:
13607""""""""""
13608
13609The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
13610source location to the destination location, which are not allowed to
13611overlap. It copies "len" bytes of memory over. If the argument is known
13612to be aligned to some boundary, this can be specified as an attribute on
13613the argument.
13614The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
13615'``llvm.memcpy.*``', but the generated code is guaranteed not to call any
13616external functions.
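
A minimal sketch, assuming a hypothetical function ``@copy8_inline``. The
length operand is written as a literal because this intrinsic requires a
constant integer there:

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      ; Hypothetical helper: copies 8 bytes without calling an external function.
      define void @copy8_inline(i8* %dst, i8* %src) {
        ; The third operand must be a constant; 8 is used here.
        call void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 8, i1 false)
        ret void
      }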
13617
13618.. _int_memmove:
13619
13620'``llvm.memmove``' Intrinsic
13621^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13622
13623Syntax:
13624"""""""
13625
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.
13629
13630::
13631
13632      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13633                                               i32 <len>, i1 <isvolatile>)
13634      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13635                                               i64 <len>, i1 <isvolatile>)
13636
13637Overview:
13638"""""""""
13639
The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. They are similar to the
'``llvm.memcpy``' intrinsic but allow the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13648
13649Arguments:
13650""""""""""
13651
13652The first argument is a pointer to the destination, the second is a
13653pointer to the source. The third argument is an integer argument
13654specifying the number of bytes to copy, and the fourth is a
13655boolean indicating a volatile access.
13656
13657The :ref:`align <attr_align>` parameter attribute can be provided
13658for the first and second arguments.
13659
13660If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
13661is a :ref:`volatile operation <volatile>`. The detailed access behavior is
13662not very cleanly specified and it is unwise to depend on it.
13663
13664Semantics:
13665""""""""""
13666
13667The '``llvm.memmove.*``' intrinsics copy a block of memory from the
13668source location to the destination location, which may overlap. It
13669copies "len" bytes of memory over. If the argument is known to be
13670aligned to some boundary, this can be specified as an attribute on
13671the argument.
13672
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached
to the arguments.
13675If ``<len>`` is not a well-defined value, the behavior is undefined.
13676If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13677otherwise the behavior is undefined.
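
The overlapping case can be sketched as follows. The function name
``@shift_right_by_one`` is hypothetical, and ``%buf`` is assumed to point to at
least 16 accessible bytes:

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      ; Hypothetical helper: moves bytes 0..14 of %buf to positions 1..15.
      define void @shift_right_by_one(i8* %buf) {
        ; The destination starts one byte after the source, so the two
        ; 15-byte ranges overlap. This is allowed for memmove.
        %dst = getelementptr inbounds i8, i8* %buf, i64 1
        call void @llvm.memmove.p0i8.p0i8.i64(i8* %dst, i8* %buf, i64 15, i1 false)
        ret void
      }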
13678
13679.. _int_memset:
13680
13681'``llvm.memset.*``' Intrinsics
13682^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13683
13684Syntax:
13685"""""""
13686
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.
13690
13691::
13692
13693      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
13694                                         i32 <len>, i1 <isvolatile>)
13695      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
13696                                         i64 <len>, i1 <isvolatile>)
13697
13698Overview:
13699"""""""""
13700
13701The '``llvm.memset.*``' intrinsics fill a block of memory with a
13702particular byte value.
13703
13704Note that, unlike the standard libc function, the ``llvm.memset``
13705intrinsic does not return a value and takes an extra volatile
13706argument. Also, the destination can be in an arbitrary address space.
13707
13708Arguments:
13709""""""""""
13710
13711The first argument is a pointer to the destination to fill, the second
13712is the byte value with which to fill it, the third argument is an
13713integer argument specifying the number of bytes to fill, and the fourth
13714is a boolean indicating a volatile access.
13715
The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
13718
13719If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
13720a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13721very cleanly specified and it is unwise to depend on it.
13722
13723Semantics:
13724""""""""""
13725
13726The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
13727at the destination location. If the argument is known to be
13728aligned to some boundary, this can be specified as an attribute on
13729the argument.
13730
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached
to the arguments.
13733If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined,
13735otherwise the behavior is undefined.
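
As a sketch, zero-filling a 32-byte region could be written as shown below.
The function name ``@zero32`` and the 8-byte alignment are assumptions for the
example:

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      ; Hypothetical helper: writes 32 zero bytes starting at %p.
      define void @zero32(i8* %p) {
        ; Operands: destination, fill byte, length in bytes, isvolatile.
        call void @llvm.memset.p0i8.i64(i8* align 8 %p, i8 0, i64 32, i1 false)
        ret void
      }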
13736
13737'``llvm.sqrt.*``' Intrinsic
13738^^^^^^^^^^^^^^^^^^^^^^^^^^^
13739
13740Syntax:
13741"""""""
13742
13743This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
13744floating-point or vector of floating-point type. Not all targets support
13745all types however.
13746
13747::
13748
13749      declare float     @llvm.sqrt.f32(float %Val)
13750      declare double    @llvm.sqrt.f64(double %Val)
13751      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
13752      declare fp128     @llvm.sqrt.f128(fp128 %Val)
13753      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
13754
13755Overview:
13756"""""""""
13757
13758The '``llvm.sqrt``' intrinsics return the square root of the specified value.
13759
13760Arguments:
13761""""""""""
13762
13763The argument and return value are floating-point numbers of the same type.
13764
13765Semantics:
13766""""""""""
13767
13768Return the same value as a corresponding libm '``sqrt``' function but without
13769trapping or setting ``errno``. For types specified by IEEE-754, the result
13770matches a conforming libm implementation.
13771
13772When specified with the fast-math-flag 'afn', the result may be approximated
13773using a less accurate calculation.
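
The sketch below shows the same call with and without the ``afn`` flag. The
function names are hypothetical:

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      ; No fast-math flags: the result matches a conforming libm sqrt.
      define float @exact_sqrt(float %x) {
        %r = call float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      ; With 'afn' the result may be approximated by a less accurate calculation.
      define float @approx_sqrt(float %x) {
        %r = call afn float @llvm.sqrt.f32(float %x)
        ret float %r
      }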
13774
13775'``llvm.powi.*``' Intrinsic
13776^^^^^^^^^^^^^^^^^^^^^^^^^^^
13777
13778Syntax:
13779"""""""
13780
13781This is an overloaded intrinsic. You can use ``llvm.powi`` on any
13782floating-point or vector of floating-point type. Not all targets support
13783all types however.
13784
Generally, the only supported type for the exponent is the one matching
the C type ``int``.
13787
13788::
13789
13790      declare float     @llvm.powi.f32.i32(float  %Val, i32 %power)
13791      declare double    @llvm.powi.f64.i16(double %Val, i16 %power)
13792      declare x86_fp80  @llvm.powi.f80.i32(x86_fp80  %Val, i32 %power)
13793      declare fp128     @llvm.powi.f128.i32(fp128 %Val, i32 %power)
13794      declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128  %Val, i32 %power)
13795
13796Overview:
13797"""""""""
13798
13799The '``llvm.powi.*``' intrinsics return the first operand raised to the
13800specified (positive or negative) power. The order of evaluation of
13801multiplications is not defined. When a vector of floating-point type is
13802used, the second argument remains a scalar integer value.
13803
13804Arguments:
13805""""""""""
13806
13807The second argument is an integer power, and the first is a value to
13808raise to that power.
13809
13810Semantics:
13811""""""""""
13812
13813This function returns the first value raised to the second power with an
13814unspecified sequence of rounding operations.
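
A short sketch of a scalar and a vector use, assuming a target where ``int``
is 32 bits wide. The function names ``@cube`` and ``@inv_square`` are
hypothetical:

.. code-block:: llvm

      declare double @llvm.powi.f64.i32(double, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      ; Hypothetical helper: %x raised to the third power.
      define double @cube(double %x) {
        %r = call double @llvm.powi.f64.i32(double %x, i32 3)
        ret double %r
      }

      ; Hypothetical helper: the exponent stays a scalar integer and is
      ; applied to every lane; a negative power is allowed.
      define <4 x float> @inv_square(<4 x float> %v) {
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 -2)
        ret <4 x float> %r
      }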
13815
13816'``llvm.sin.*``' Intrinsic
13817^^^^^^^^^^^^^^^^^^^^^^^^^^
13818
13819Syntax:
13820"""""""
13821
13822This is an overloaded intrinsic. You can use ``llvm.sin`` on any
13823floating-point or vector of floating-point type. Not all targets support
13824all types however.
13825
13826::
13827
13828      declare float     @llvm.sin.f32(float  %Val)
13829      declare double    @llvm.sin.f64(double %Val)
13830      declare x86_fp80  @llvm.sin.f80(x86_fp80  %Val)
13831      declare fp128     @llvm.sin.f128(fp128 %Val)
13832      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128  %Val)
13833
13834Overview:
13835"""""""""
13836
13837The '``llvm.sin.*``' intrinsics return the sine of the operand.
13838
13839Arguments:
13840""""""""""
13841
13842The argument and return value are floating-point numbers of the same type.
13843
13844Semantics:
13845""""""""""
13846
13847Return the same value as a corresponding libm '``sin``' function but without
13848trapping or setting ``errno``.
13849
13850When specified with the fast-math-flag 'afn', the result may be approximated
13851using a less accurate calculation.
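
For illustration, a vector overload called with the ``afn`` flag might look
like this sketch. The function name ``@sin_lanes`` is hypothetical:

.. code-block:: llvm

      declare <2 x double> @llvm.sin.v2f64(<2 x double>)

      ; Hypothetical helper: computes the sine of each lane; 'afn' permits
      ; a less accurate approximation.
      define <2 x double> @sin_lanes(<2 x double> %v) {
        %r = call afn <2 x double> @llvm.sin.v2f64(<2 x double> %v)
        ret <2 x double> %r
      }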
13852
13853'``llvm.cos.*``' Intrinsic
13854^^^^^^^^^^^^^^^^^^^^^^^^^^
13855
13856Syntax:
13857"""""""
13858
13859This is an overloaded intrinsic. You can use ``llvm.cos`` on any
13860floating-point or vector of floating-point type. Not all targets support
13861all types however.
13862
13863::
13864
13865      declare float     @llvm.cos.f32(float  %Val)
13866      declare double    @llvm.cos.f64(double %Val)
13867      declare x86_fp80  @llvm.cos.f80(x86_fp80  %Val)
13868      declare fp128     @llvm.cos.f128(fp128 %Val)
13869      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128  %Val)
13870
13871Overview:
13872"""""""""
13873
13874The '``llvm.cos.*``' intrinsics return the cosine of the operand.
13875
13876Arguments:
13877""""""""""
13878
13879The argument and return value are floating-point numbers of the same type.
13880
13881Semantics:
13882""""""""""
13883
13884Return the same value as a corresponding libm '``cos``' function but without
13885trapping or setting ``errno``.
13886
13887When specified with the fast-math-flag 'afn', the result may be approximated
13888using a less accurate calculation.
13889
13890'``llvm.pow.*``' Intrinsic
13891^^^^^^^^^^^^^^^^^^^^^^^^^^
13892
13893Syntax:
13894"""""""
13895
13896This is an overloaded intrinsic. You can use ``llvm.pow`` on any
13897floating-point or vector of floating-point type. Not all targets support
13898all types however.
13899
13900::
13901
13902      declare float     @llvm.pow.f32(float  %Val, float %Power)
13903      declare double    @llvm.pow.f64(double %Val, double %Power)
13904      declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
13905      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
13907
13908Overview:
13909"""""""""
13910
13911The '``llvm.pow.*``' intrinsics return the first operand raised to the
13912specified (positive or negative) power.
13913
13914Arguments:
13915""""""""""
13916
13917The arguments and return value are floating-point numbers of the same type.
13918
13919Semantics:
13920""""""""""
13921
13922Return the same value as a corresponding libm '``pow``' function but without
13923trapping or setting ``errno``.
13924
13925When specified with the fast-math-flag 'afn', the result may be approximated
13926using a less accurate calculation.
13927
13928'``llvm.exp.*``' Intrinsic
13929^^^^^^^^^^^^^^^^^^^^^^^^^^
13930
13931Syntax:
13932"""""""
13933
13934This is an overloaded intrinsic. You can use ``llvm.exp`` on any
13935floating-point or vector of floating-point type. Not all targets support
13936all types however.
13937
13938::
13939
13940      declare float     @llvm.exp.f32(float  %Val)
13941      declare double    @llvm.exp.f64(double %Val)
13942      declare x86_fp80  @llvm.exp.f80(x86_fp80  %Val)
13943      declare fp128     @llvm.exp.f128(fp128 %Val)
13944      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128  %Val)
13945
13946Overview:
13947"""""""""
13948
13949The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
13950value.
13951
13952Arguments:
13953""""""""""
13954
13955The argument and return value are floating-point numbers of the same type.
13956
13957Semantics:
13958""""""""""
13959
13960Return the same value as a corresponding libm '``exp``' function but without
13961trapping or setting ``errno``.
13962
13963When specified with the fast-math-flag 'afn', the result may be approximated
13964using a less accurate calculation.
13965
13966'``llvm.exp2.*``' Intrinsic
13967^^^^^^^^^^^^^^^^^^^^^^^^^^^
13968
13969Syntax:
13970"""""""
13971
13972This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
13973floating-point or vector of floating-point type. Not all targets support
13974all types however.
13975
13976::
13977
13978      declare float     @llvm.exp2.f32(float  %Val)
13979      declare double    @llvm.exp2.f64(double %Val)
13980      declare x86_fp80  @llvm.exp2.f80(x86_fp80  %Val)
13981      declare fp128     @llvm.exp2.f128(fp128 %Val)
13982      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128  %Val)
13983
13984Overview:
13985"""""""""
13986
13987The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
13988specified value.
13989
13990Arguments:
13991""""""""""
13992
13993The argument and return value are floating-point numbers of the same type.
13994
13995Semantics:
13996""""""""""
13997
13998Return the same value as a corresponding libm '``exp2``' function but without
13999trapping or setting ``errno``.
14000
14001When specified with the fast-math-flag 'afn', the result may be approximated
14002using a less accurate calculation.
14003
14004'``llvm.log.*``' Intrinsic
14005^^^^^^^^^^^^^^^^^^^^^^^^^^
14006
14007Syntax:
14008"""""""
14009
14010This is an overloaded intrinsic. You can use ``llvm.log`` on any
14011floating-point or vector of floating-point type. Not all targets support
14012all types however.
14013
14014::
14015
14016      declare float     @llvm.log.f32(float  %Val)
14017      declare double    @llvm.log.f64(double %Val)
14018      declare x86_fp80  @llvm.log.f80(x86_fp80  %Val)
14019      declare fp128     @llvm.log.f128(fp128 %Val)
14020      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128  %Val)
14021
14022Overview:
14023"""""""""
14024
14025The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
14026value.
14027
14028Arguments:
14029""""""""""
14030
14031The argument and return value are floating-point numbers of the same type.
14032
14033Semantics:
14034""""""""""
14035
14036Return the same value as a corresponding libm '``log``' function but without
14037trapping or setting ``errno``.
14038
14039When specified with the fast-math-flag 'afn', the result may be approximated
14040using a less accurate calculation.
14041
14042'``llvm.log10.*``' Intrinsic
14043^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14044
14045Syntax:
14046"""""""
14047
14048This is an overloaded intrinsic. You can use ``llvm.log10`` on any
14049floating-point or vector of floating-point type. Not all targets support
14050all types however.
14051
14052::
14053
14054      declare float     @llvm.log10.f32(float  %Val)
14055      declare double    @llvm.log10.f64(double %Val)
14056      declare x86_fp80  @llvm.log10.f80(x86_fp80  %Val)
14057      declare fp128     @llvm.log10.f128(fp128 %Val)
14058      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128  %Val)
14059
14060Overview:
14061"""""""""
14062
14063The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
14064specified value.
14065
14066Arguments:
14067""""""""""
14068
14069The argument and return value are floating-point numbers of the same type.
14070
14071Semantics:
14072""""""""""
14073
14074Return the same value as a corresponding libm '``log10``' function but without
14075trapping or setting ``errno``.
14076
14077When specified with the fast-math-flag 'afn', the result may be approximated
14078using a less accurate calculation.
14079
14080'``llvm.log2.*``' Intrinsic
14081^^^^^^^^^^^^^^^^^^^^^^^^^^^
14082
14083Syntax:
14084"""""""
14085
14086This is an overloaded intrinsic. You can use ``llvm.log2`` on any
14087floating-point or vector of floating-point type. Not all targets support
14088all types however.
14089
14090::
14091
14092      declare float     @llvm.log2.f32(float  %Val)
14093      declare double    @llvm.log2.f64(double %Val)
14094      declare x86_fp80  @llvm.log2.f80(x86_fp80  %Val)
14095      declare fp128     @llvm.log2.f128(fp128 %Val)
14096      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128  %Val)
14097
14098Overview:
14099"""""""""
14100
14101The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
14102value.
14103
14104Arguments:
14105""""""""""
14106
14107The argument and return value are floating-point numbers of the same type.
14108
14109Semantics:
14110""""""""""
14111
14112Return the same value as a corresponding libm '``log2``' function but without
14113trapping or setting ``errno``.
14114
14115When specified with the fast-math-flag 'afn', the result may be approximated
14116using a less accurate calculation.
14117
14118.. _int_fma:
14119
14120'``llvm.fma.*``' Intrinsic
14121^^^^^^^^^^^^^^^^^^^^^^^^^^
14122
14123Syntax:
14124"""""""
14125
14126This is an overloaded intrinsic. You can use ``llvm.fma`` on any
14127floating-point or vector of floating-point type. Not all targets support
14128all types however.
14129
14130::
14131
14132      declare float     @llvm.fma.f32(float  %a, float  %b, float  %c)
14133      declare double    @llvm.fma.f64(double %a, double %b, double %c)
14134      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
14135      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
14136      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
14137
14138Overview:
14139"""""""""
14140
14141The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
14142
14143Arguments:
14144""""""""""
14145
14146The arguments and return value are floating-point numbers of the same type.
14147
14148Semantics:
14149""""""""""
14150
14151Return the same value as a corresponding libm '``fma``' function but without
14152trapping or setting ``errno``.
14153
14154When specified with the fast-math-flag 'afn', the result may be approximated
14155using a less accurate calculation.
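
A minimal sketch, assuming a hypothetical function ``@axpy`` that computes
``a * x + y``. As with libm ``fma``, the product and sum are rounded once:

.. code-block:: llvm

      declare double @llvm.fma.f64(double, double, double)

      ; Hypothetical helper: fused (%a * %x) + %y.
      define double @axpy(double %a, double %x, double %y) {
        %r = call double @llvm.fma.f64(double %a, double %x, double %y)
        ret double %r
      }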
14156
14157'``llvm.fabs.*``' Intrinsic
14158^^^^^^^^^^^^^^^^^^^^^^^^^^^
14159
14160Syntax:
14161"""""""
14162
14163This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
14164floating-point or vector of floating-point type. Not all targets support
14165all types however.
14166
14167::
14168
14169      declare float     @llvm.fabs.f32(float  %Val)
14170      declare double    @llvm.fabs.f64(double %Val)
14171      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
14172      declare fp128     @llvm.fabs.f128(fp128 %Val)
14173      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
14174
14175Overview:
14176"""""""""
14177
14178The '``llvm.fabs.*``' intrinsics return the absolute value of the
14179operand.
14180
14181Arguments:
14182""""""""""
14183
14184The argument and return value are floating-point numbers of the same
14185type.
14186
14187Semantics:
14188""""""""""
14189
14190This function returns the same values as the libm ``fabs`` functions
14191would, and handles error conditions in the same way.
14192
14193'``llvm.minnum.*``' Intrinsic
14194^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14195
14196Syntax:
14197"""""""
14198
14199This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
14200floating-point or vector of floating-point type. Not all targets support
14201all types however.
14202
14203::
14204
14205      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
14206      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
14207      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14208      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
14209      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14210
14211Overview:
14212"""""""""
14213
14214The '``llvm.minnum.*``' intrinsics return the minimum of the two
14215arguments.
14216
14217
14218Arguments:
14219""""""""""
14220
14221The arguments and return value are floating-point numbers of the same
14222type.
14223
14224Semantics:
14225""""""""""
14226
Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs. This matches the behavior of libm's fmin.
14229
14230If either operand is a NaN, returns the other non-NaN operand. Returns
14231NaN only if both operands are NaN. The returned NaN is always
14232quiet. If the operands compare equal, returns a value that compares
14233equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
14234return either -0.0 or 0.0.
14235
14236Unlike the IEEE-754 2008 behavior, this does not distinguish between
14237signaling and quiet NaN inputs. If a target's implementation follows
14238the standard and returns a quiet NaN if either input is a signaling
14239NaN, the intrinsic lowering is responsible for quieting the inputs to
14240correctly return the non-NaN input (e.g. by using the equivalent of
14241``llvm.canonicalize``).
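
The NaN handling can be sketched with a constant quiet NaN operand. The
function name ``@min_ignoring_nan`` is hypothetical:

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      ; Hypothetical helper: the second operand is a quiet NaN, so the call
      ; returns %x whenever %x is not itself a NaN.
      define float @min_ignoring_nan(float %x) {
        %r = call float @llvm.minnum.f32(float %x, float 0x7FF8000000000000)
        ret float %r
      }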
14242
14243
14244'``llvm.maxnum.*``' Intrinsic
14245^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14246
14247Syntax:
14248"""""""
14249
14250This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
14251floating-point or vector of floating-point type. Not all targets support
14252all types however.
14253
14254::
14255
14256      declare float     @llvm.maxnum.f32(float  %Val0, float  %Val1)
14257      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
14258      declare x86_fp80  @llvm.maxnum.f80(x86_fp80  %Val0, x86_fp80  %Val1)
14259      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
14260      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128  %Val0, ppc_fp128  %Val1)
14261
14262Overview:
14263"""""""""
14264
14265The '``llvm.maxnum.*``' intrinsics return the maximum of the two
14266arguments.
14267
14268
14269Arguments:
14270""""""""""
14271
14272The arguments and return value are floating-point numbers of the same
14273type.
14274
14275Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
14278signaling NaNs. This matches the behavior of libm's fmax.
14279
14280If either operand is a NaN, returns the other non-NaN operand. Returns
14281NaN only if both operands are NaN. The returned NaN is always
14282quiet. If the operands compare equal, returns a value that compares
14283equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
14284return either -0.0 or 0.0.
14285
14286Unlike the IEEE-754 2008 behavior, this does not distinguish between
14287signaling and quiet NaN inputs. If a target's implementation follows
14288the standard and returns a quiet NaN if either input is a signaling
14289NaN, the intrinsic lowering is responsible for quieting the inputs to
14290correctly return the non-NaN input (e.g. by using the equivalent of
14291``llvm.canonicalize``).
14292
14293'``llvm.minimum.*``' Intrinsic
14294^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14295
14296Syntax:
14297"""""""
14298
14299This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
14300floating-point or vector of floating-point type. Not all targets support
14301all types however.
14302
14303::
14304
14305      declare float     @llvm.minimum.f32(float %Val0, float %Val1)
14306      declare double    @llvm.minimum.f64(double %Val0, double %Val1)
14307      declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14308      declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
14309      declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14310
14311Overview:
14312"""""""""
14313
14314The '``llvm.minimum.*``' intrinsics return the minimum of the two
14315arguments, propagating NaNs and treating -0.0 as less than +0.0.
14316
14317
14318Arguments:
14319""""""""""
14320
14321The arguments and return value are floating-point numbers of the same
14322type.
14323
14324Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
14327of the two arguments. -0.0 is considered to be less than +0.0 for this
14328intrinsic. Note that these are the semantics specified in the draft of
14329IEEE 754-2018.
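
Both rules can be sketched with constant operands. The function names are
hypothetical:

.. code-block:: llvm

      declare float @llvm.minimum.f32(float, float)

      ; -0.0 is treated as less than +0.0, so this returns -0.0.
      define float @min_of_zeros() {
        %r = call float @llvm.minimum.f32(float -0.0, float 0.0)
        ret float %r
      }

      ; One operand is a NaN, so the result is a NaN regardless of %x.
      define float @min_with_nan(float %x) {
        %r = call float @llvm.minimum.f32(float %x, float 0x7FF8000000000000)
        ret float %r
      }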
14330
14331'``llvm.maximum.*``' Intrinsic
14332^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14333
14334Syntax:
14335"""""""
14336
14337This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
14338floating-point or vector of floating-point type. Not all targets support
14339all types however.
14340
14341::
14342
14343      declare float     @llvm.maximum.f32(float %Val0, float %Val1)
14344      declare double    @llvm.maximum.f64(double %Val0, double %Val1)
14345      declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14346      declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
14347      declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14348
14349Overview:
14350"""""""""
14351
14352The '``llvm.maximum.*``' intrinsics return the maximum of the two
14353arguments, propagating NaNs and treating -0.0 as less than +0.0.
14354
14355
14356Arguments:
14357""""""""""
14358
14359The arguments and return value are floating-point numbers of the same
14360type.
14361
14362Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
14365of the two arguments. -0.0 is considered to be less than +0.0 for this
14366intrinsic. Note that these are the semantics specified in the draft of
14367IEEE 754-2018.
14368
14369'``llvm.copysign.*``' Intrinsic
14370^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14371
14372Syntax:
14373"""""""
14374
14375This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
14376floating-point or vector of floating-point type. Not all targets support
14377all types however.
14378
14379::
14380
14381      declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
14382      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
14383      declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
14384      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
14385      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)
14386
14387Overview:
14388"""""""""
14389
14390The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
14391first operand and the sign of the second operand.
14392
14393Arguments:
14394""""""""""
14395
14396The arguments and return value are floating-point numbers of the same
14397type.
14398
14399Semantics:
14400""""""""""
14401
14402This function returns the same values as the libm ``copysign``
14403functions would, and handles error conditions in the same way.
14404
14405'``llvm.floor.*``' Intrinsic
14406^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14407
14408Syntax:
14409"""""""
14410
14411This is an overloaded intrinsic. You can use ``llvm.floor`` on any
14412floating-point or vector of floating-point type. Not all targets support
14413all types however.
14414
14415::
14416
14417      declare float     @llvm.floor.f32(float  %Val)
14418      declare double    @llvm.floor.f64(double %Val)
14419      declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
14420      declare fp128     @llvm.floor.f128(fp128 %Val)
14421      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)
14422
14423Overview:
14424"""""""""
14425
14426The '``llvm.floor.*``' intrinsics return the floor of the operand.
14427
14428Arguments:
14429""""""""""
14430
14431The argument and return value are floating-point numbers of the same
14432type.
14433
14434Semantics:
14435""""""""""
14436
14437This function returns the same values as the libm ``floor`` functions
14438would, and handles error conditions in the same way.
14439
14440'``llvm.ceil.*``' Intrinsic
14441^^^^^^^^^^^^^^^^^^^^^^^^^^^
14442
14443Syntax:
14444"""""""
14445
14446This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
14447floating-point or vector of floating-point type. Not all targets support
14448all types however.
14449
14450::
14451
14452      declare float     @llvm.ceil.f32(float  %Val)
14453      declare double    @llvm.ceil.f64(double %Val)
14454      declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
14455      declare fp128     @llvm.ceil.f128(fp128 %Val)
14456      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)
14457
14458Overview:
14459"""""""""
14460
14461The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
14462
14463Arguments:
14464""""""""""
14465
14466The argument and return value are floating-point numbers of the same
14467type.
14468
14469Semantics:
14470""""""""""
14471
14472This function returns the same values as the libm ``ceil`` functions
14473would, and handles error conditions in the same way.
14474
14475'``llvm.trunc.*``' Intrinsic
14476^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14477
14478Syntax:
14479"""""""
14480
14481This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
14482floating-point or vector of floating-point type. Not all targets support
14483all types however.
14484
14485::
14486
14487      declare float     @llvm.trunc.f32(float  %Val)
14488      declare double    @llvm.trunc.f64(double %Val)
14489      declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
14490      declare fp128     @llvm.trunc.f128(fp128 %Val)
14491      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)
14492
14493Overview:
14494"""""""""
14495
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.
14498
14499Arguments:
14500""""""""""
14501
14502The argument and return value are floating-point numbers of the same
14503type.
14504
14505Semantics:
14506""""""""""
14507
14508This function returns the same values as the libm ``trunc`` functions
14509would, and handles error conditions in the same way.
14510
14511'``llvm.rint.*``' Intrinsic
14512^^^^^^^^^^^^^^^^^^^^^^^^^^^
14513
14514Syntax:
14515"""""""
14516
14517This is an overloaded intrinsic. You can use ``llvm.rint`` on any
14518floating-point or vector of floating-point type. Not all targets support
14519all types however.
14520
14521::
14522
14523      declare float     @llvm.rint.f32(float  %Val)
14524      declare double    @llvm.rint.f64(double %Val)
14525      declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
14526      declare fp128     @llvm.rint.f128(fp128 %Val)
14527      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)
14528
14529Overview:
14530"""""""""
14531
The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand is not an integer.
14535
14536Arguments:
14537""""""""""
14538
14539The argument and return value are floating-point numbers of the same
14540type.
14541
14542Semantics:
14543""""""""""
14544
14545This function returns the same values as the libm ``rint`` functions
14546would, and handles error conditions in the same way.
14547
14548'``llvm.nearbyint.*``' Intrinsic
14549^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14550
14551Syntax:
14552"""""""
14553
14554This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
14555floating-point or vector of floating-point type. Not all targets support
14556all types however.
14557
14558::
14559
14560      declare float     @llvm.nearbyint.f32(float  %Val)
14561      declare double    @llvm.nearbyint.f64(double %Val)
14562      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
14563      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
14564      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)
14565
14566Overview:
14567"""""""""
14568
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.
14571
14572Arguments:
14573""""""""""
14574
14575The argument and return value are floating-point numbers of the same
14576type.
14577
14578Semantics:
14579""""""""""
14580
14581This function returns the same values as the libm ``nearbyint``
14582functions would, and handles error conditions in the same way.
14583
14584'``llvm.round.*``' Intrinsic
14585^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14586
14587Syntax:
14588"""""""
14589
14590This is an overloaded intrinsic. You can use ``llvm.round`` on any
14591floating-point or vector of floating-point type. Not all targets support
14592all types however.
14593
14594::
14595
14596      declare float     @llvm.round.f32(float  %Val)
14597      declare double    @llvm.round.f64(double %Val)
14598      declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
14599      declare fp128     @llvm.round.f128(fp128 %Val)
14600      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)
14601
14602Overview:
14603"""""""""
14604
The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, rounding halfway cases away from zero as the libm
``round`` functions do.
14607
14608Arguments:
14609""""""""""
14610
14611The argument and return value are floating-point numbers of the same
14612type.
14613
14614Semantics:
14615""""""""""
14616
14617This function returns the same values as the libm ``round``
14618functions would, and handles error conditions in the same way.
14619
14620'``llvm.roundeven.*``' Intrinsic
14621^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14622
14623Syntax:
14624"""""""
14625
14626This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
14627floating-point or vector of floating-point type. Not all targets support
14628all types however.
14629
14630::
14631
14632      declare float     @llvm.roundeven.f32(float  %Val)
14633      declare double    @llvm.roundeven.f64(double %Val)
14634      declare x86_fp80  @llvm.roundeven.f80(x86_fp80  %Val)
14635      declare fp128     @llvm.roundeven.f128(fp128 %Val)
14636      declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128  %Val)
14637
14638Overview:
14639"""""""""
14640
The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to
the nearest value that is an even integer).
14644
14645Arguments:
14646""""""""""
14647
14648The argument and return value are floating-point numbers of the same type.
14649
14650Semantics:
14651""""""""""
14652
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven``,
except that it does not raise floating-point exceptions.
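
Ties-to-even differs from the ties-away-from-zero rounding of libm ``round``
only on halfway cases. The C sketch below contrasts the two. The helper names
are hypothetical, and the model is deliberately limited: it assumes a finite
input with magnitude below 2\ :sup:`63` and does not preserve the sign of a
zero result.

.. code-block:: c

      /* Hypothetical models; valid only for finite x with |x| < 2^63. */

      /* Round toward zero by converting through a 64-bit integer. */
      static double trunc_model(double x) { return (double)(long long)x; }

      /* Ties away from zero, as libm round does. */
      static double round_away_model(double x) {
        double t = trunc_model(x);
        double frac = x - t; /* exact; same sign as x, magnitude below 1 */
        if (frac >= 0.5)
          return t + 1.0;
        if (frac <= -0.5)
          return t - 1.0;
        return t;
      }

      /* Ties to even, as roundToIntegralTiesToEven specifies. */
      static double roundeven_model(double x) {
        double t = trunc_model(x);
        double frac = x - t;
        int odd = ((long long)t % 2) != 0;
        if (frac > 0.5 || (frac == 0.5 && odd))
          return t + 1.0;
        if (frac < -0.5 || (frac == -0.5 && odd))
          return t - 1.0;
        return t;
      }

With these models, 2.5 rounds to 3.0 under ties-away but to 2.0 under
ties-to-even, while 3.5 rounds to 4.0 under both.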
14656
14657
14658'``llvm.lround.*``' Intrinsic
14659^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14660
14661Syntax:
14662"""""""
14663
14664This is an overloaded intrinsic. You can use ``llvm.lround`` on any
14665floating-point type. Not all targets support all types however.
14666
14667::
14668
      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
14680
14681Overview:
14682"""""""""
14683
14684The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
14685integer with ties away from zero.
14686
14687
14688Arguments:
14689""""""""""
14690
14691The argument is a floating-point number and the return value is an integer
14692type.
14693
14694Semantics:
14695""""""""""
14696
14697This function returns the same values as the libm ``lround``
14698functions would, but without setting errno.
14699
14700'``llvm.llround.*``' Intrinsic
14701^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14702
14703Syntax:
14704"""""""
14705
14706This is an overloaded intrinsic. You can use ``llvm.llround`` on any
14707floating-point type. Not all targets support all types however.
14708
14709::
14710
      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
14716
14717Overview:
14718"""""""""
14719
14720The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
14721integer with ties away from zero.
14722
14723Arguments:
14724""""""""""
14725
14726The argument is a floating-point number and the return value is an integer
14727type.
14728
14729Semantics:
14730""""""""""
14731
14732This function returns the same values as the libm ``llround``
14733functions would, but without setting errno.
14734
14735'``llvm.lrint.*``' Intrinsic
14736^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14737
14738Syntax:
14739"""""""
14740
14741This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
14742floating-point type. Not all targets support all types however.
14743
14744::
14745
      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
14757
14758Overview:
14759"""""""""
14760
14761The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
14762integer.
14763
14764
14765Arguments:
14766""""""""""
14767
14768The argument is a floating-point number and the return value is an integer
14769type.
14770
14771Semantics:
14772""""""""""
14773
14774This function returns the same values as the libm ``lrint``
14775functions would, but without setting errno.
14776
14777'``llvm.llrint.*``' Intrinsic
14778^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14779
14780Syntax:
14781"""""""
14782
14783This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
14784floating-point type. Not all targets support all types however.
14785
14786::
14787
      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
14793
14794Overview:
14795"""""""""
14796
14797The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
14798integer.
14799
14800Arguments:
14801""""""""""
14802
14803The argument is a floating-point number and the return value is an integer
14804type.
14805
14806Semantics:
14807""""""""""
14808
14809This function returns the same values as the libm ``llrint``
14810functions would, but without setting errno.
14811
14812Bit Manipulation Intrinsics
14813---------------------------
14814
14815LLVM provides intrinsics for a few important bit manipulation
14816operations. These allow efficient code generation for some algorithms.
14817
14818'``llvm.bitreverse.*``' Intrinsics
14819^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14820
14821Syntax:
14822"""""""
14823
14824This is an overloaded intrinsic function. You can use bitreverse on any
14825integer type.
14826
14827::
14828
14829      declare i16 @llvm.bitreverse.i16(i16 <id>)
14830      declare i32 @llvm.bitreverse.i32(i32 <id>)
14831      declare i64 @llvm.bitreverse.i64(i64 <id>)
14832      declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
14833
14834Overview:
14835"""""""""
14836
14837The '``llvm.bitreverse``' family of intrinsics is used to reverse the
14838bitpattern of an integer value or vector of integer values; for example
14839``0b10110110`` becomes ``0b01101101``.
14840
14841Semantics:
14842""""""""""
14843
The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output. The vector
intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
basis and the element order is not affected.
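
The mirroring can be sketched in C as below: bit 0 and the most significant
bit trade places, and so on inward. The function ``bitreverse_model`` is a
hypothetical reference model for widths from 1 to 32 bits, not an LLVM API,
and it assumes the value occupies the low ``n`` bits of its argument.

.. code-block:: c

      #include <stdint.h>

      /* Hypothetical model of llvm.bitreverse.iN for 1 <= n <= 32. */
      static uint32_t bitreverse_model(uint32_t x, unsigned n) {
        uint32_t r = 0;
        for (unsigned m = 0; m < n; ++m) {
          /* Input bit m lands at output bit n - m - 1. */
          if ((x >> m) & 1u)
            r |= (uint32_t)1 << (n - m - 1);
        }
        return r;
      }

For the 8-bit example in the overview, ``bitreverse_model(0xB6, 8)`` yields
``0x6D``.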
14848
14849'``llvm.bswap.*``' Intrinsics
14850^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14851
14852Syntax:
14853"""""""
14854
14855This is an overloaded intrinsic function. You can use bswap on any
14856integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
14857
14858::
14859
14860      declare i16 @llvm.bswap.i16(i16 <id>)
14861      declare i32 @llvm.bswap.i32(i32 <id>)
14862      declare i64 @llvm.bswap.i64(i64 <id>)
14863      declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
14864
14865Overview:
14866"""""""""
14867
14868The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
14869value or vector of integer values with an even number of bytes (positive
14870multiple of 16 bits).
14871
14872Semantics:
14873""""""""""
14874
14875The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
14876and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
14877intrinsic returns an i32 value that has the four bytes of the input i32
14878swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
14879returned i32 will have its bytes in 3, 2, 1, 0 order. The
14880``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
14881concept to additional even-byte lengths (6 bytes, 8 bytes and more,
14882respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
14883operate on a per-element basis and the element order is not affected.
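
A minimal C sketch of the 16-bit and 32-bit cases follows. The function names
are hypothetical reference models rather than LLVM APIs; they only restate the
byte reordering described above.

.. code-block:: c

      #include <stdint.h>

      /* Hypothetical model of llvm.bswap.i16: swap the high and low bytes. */
      static uint16_t bswap16_model(uint16_t x) {
        return (uint16_t)((x << 8) | (x >> 8));
      }

      /* Hypothetical model of llvm.bswap.i32: bytes 0, 1, 2, 3 become 3, 2, 1, 0. */
      static uint32_t bswap32_model(uint32_t x) {
        return (x << 24) | ((x & 0x0000FF00u) << 8) |
               ((x >> 8) & 0x0000FF00u) | (x >> 24);
      }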
14884
14885'``llvm.ctpop.*``' Intrinsic
14886^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14887
14888Syntax:
14889"""""""
14890
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.
14894
14895::
14896
14897      declare i8 @llvm.ctpop.i8(i8  <src>)
14898      declare i16 @llvm.ctpop.i16(i16 <src>)
14899      declare i32 @llvm.ctpop.i32(i32 <src>)
14900      declare i64 @llvm.ctpop.i64(i64 <src>)
14901      declare i256 @llvm.ctpop.i256(i256 <src>)
14902      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
14903
14904Overview:
14905"""""""""
14906
14907The '``llvm.ctpop``' family of intrinsics counts the number of bits set
14908in a value.
14909
14910Arguments:
14911""""""""""
14912
14913The only argument is the value to be counted. The argument may be of any
14914integer type, or a vector with integer elements. The return type must
14915match the argument type.
14916
14917Semantics:
14918""""""""""
14919
14920The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
14921each element of a vector.
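
For the 32-bit scalar case, the count can be modeled in C as shown below.
``ctpop32_model`` is a hypothetical name used only for illustration, and the
loop is one of several equivalent ways to count set bits.

.. code-block:: c

      #include <stdint.h>

      /* Hypothetical model of llvm.ctpop.i32; not an LLVM API. */
      static unsigned ctpop32_model(uint32_t x) {
        unsigned count = 0;
        while (x != 0) {
          x &= x - 1; /* clear the lowest set bit */
          ++count;
        }
        return count;
      }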
14922
14923'``llvm.ctlz.*``' Intrinsic
14924^^^^^^^^^^^^^^^^^^^^^^^^^^^
14925
14926Syntax:
14927"""""""
14928
14929This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
14930integer bit width, or any vector whose elements are integers. Not all
14931targets support all bit widths or vector types, however.
14932
14933::
14934
14935      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
14936      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
14937      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
14938      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
14939      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
14940      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
14941
14942Overview:
14943"""""""""
14944
14945The '``llvm.ctlz``' family of intrinsic functions counts the number of
14946leading zeros in a variable.
14947
14948Arguments:
14949""""""""""
14950
14951The first argument is the value to be counted. This argument may be of
14952any integer type, or a vector with integer element type. The return
14953type must match the first argument type.
14954
14955The second argument must be a constant and is a flag to indicate whether
14956the intrinsic should ensure that a zero as the first argument produces a
14957defined result. Historically some architectures did not provide a
14958defined result for zero values as efficiently, and many algorithms are
14959now predicated on avoiding zero-value inputs.
14960
14961Semantics:
14962""""""""""
14963
14964The '``llvm.ctlz``' intrinsic counts the leading (most significant)
14965zeros in a variable, or within each element of the vector. If
14966``src == 0`` then the result is the size in bits of the type of ``src``
14967if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
14968``llvm.ctlz(i32 2) = 30``.
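
The C sketch below models the 32-bit scalar case with ``is_zero_undef == 0``,
so a zero input returns the bit width. The name ``ctlz32_model`` is
hypothetical; the sketch does not attempt to model the ``undef`` result of the
other form.

.. code-block:: c

      #include <stdint.h>

      /* Hypothetical model of llvm.ctlz.i32 with is_zero_undef == 0. */
      static unsigned ctlz32_model(uint32_t x) {
        if (x == 0)
          return 32; /* defined result: the bit width of the type */
        unsigned n = 0;
        /* Shift left until the most significant bit is set. */
        while ((x & 0x80000000u) == 0) {
          x <<= 1;
          ++n;
        }
        return n;
      }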
14969
14970'``llvm.cttz.*``' Intrinsic
14971^^^^^^^^^^^^^^^^^^^^^^^^^^^
14972
14973Syntax:
14974"""""""
14975
14976This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
14977integer bit width, or any vector of integer elements. Not all targets
14978support all bit widths or vector types, however.
14979
14980::
14981
14982      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
14983      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
14984      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
14985      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
14986      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
14987      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
14988
14989Overview:
14990"""""""""
14991
14992The '``llvm.cttz``' family of intrinsic functions counts the number of
14993trailing zeros.
14994
14995Arguments:
14996""""""""""
14997
14998The first argument is the value to be counted. This argument may be of
14999any integer type, or a vector with integer element type. The return
15000type must match the first argument type.
15001
15002The second argument must be a constant and is a flag to indicate whether
15003the intrinsic should ensure that a zero as the first argument produces a
15004defined result. Historically some architectures did not provide a
15005defined result for zero values as efficiently, and many algorithms are
15006now predicated on avoiding zero-value inputs.
15007
15008Semantics:
15009""""""""""
15010
15011The '``llvm.cttz``' intrinsic counts the trailing (least significant)
15012zeros in a variable, or within each element of a vector. If ``src == 0``
15013then the result is the size in bits of the type of ``src`` if
15014``is_zero_undef == 0`` and ``undef`` otherwise. For example,
15015``llvm.cttz(2) = 1``.
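
The C sketch below models the 32-bit scalar case with ``is_zero_undef == 0``,
so a zero input returns the bit width. The name ``cttz32_model`` is
hypothetical; the sketch does not attempt to model the ``undef`` result of the
other form.

.. code-block:: c

      #include <stdint.h>

      /* Hypothetical model of llvm.cttz.i32 with is_zero_undef == 0. */
      static unsigned cttz32_model(uint32_t x) {
        if (x == 0)
          return 32; /* defined result: the bit width of the type */
        unsigned n = 0;
        /* Shift right until the least significant bit is set. */
        while ((x & 1u) == 0) {
          x >>= 1;
          ++n;
        }
        return n;
      }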
15016
15017.. _int_overflow:
15018
15019'``llvm.fshl.*``' Intrinsic
15020^^^^^^^^^^^^^^^^^^^^^^^^^^^
15021
15022Syntax:
15023"""""""
15024
15025This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
15026integer bit width or any vector of integer elements. Not all targets
15027support all bit widths or vector types, however.
15028
15029::
15030
15031      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
15032      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
15033      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15034
15035Overview:
15036"""""""""
15037
15038The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
15039the first two values are concatenated as { %a : %b } (%a is the most significant
15040bits of the wide value), the combined value is shifted left, and the most
15041significant bits are extracted to produce a result that is the same size as the
15042original arguments. If the first 2 arguments are identical, this is equivalent
15043to a rotate left operation. For vector types, the operation occurs for each
15044element of the vector. The shift argument is treated as an unsigned amount
15045modulo the element size of the arguments.
15046
15047Arguments:
15048""""""""""
15049
15050The first two arguments are the values to be concatenated. The third
15051argument is the shift amount. The arguments may be any integer type or a
15052vector with integer element type. All arguments and the return value must
15053have the same type.
15054
15055Example:
15056""""""""
15057
15058.. code-block:: text
15059
15060      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
15061      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
15062      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
15063      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)
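
The pseudo-expression in the first example line can be written out in C for
the ``i8`` case roughly as follows. ``fshl8_model`` is a hypothetical reference
model, not an LLVM API; it widens to 16 bits, shifts, and keeps the upper byte.

.. code-block:: c

      #include <stdint.h>

      /* Hypothetical model of llvm.fshl.i8; not an LLVM API. */
      static uint8_t fshl8_model(uint8_t a, uint8_t b, uint8_t c) {
        unsigned s = c % 8u;                        /* shift amount modulo the width */
        uint16_t wide = (uint16_t)((a << 8) | b);   /* a holds the high bits */
        /* Shift left, then take bits 8..15 of the result. */
        return (uint8_t)(((uint32_t)wide << s) >> 8);
      }

Passing the same value for both data operands gives a rotate left; for
instance ``fshl8_model(0x81, 0x81, 1)`` is ``0x03``.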
15064
15065'``llvm.fshr.*``' Intrinsic
15066^^^^^^^^^^^^^^^^^^^^^^^^^^^
15067
15068Syntax:
15069"""""""
15070
15071This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
15072integer bit width or any vector of integer elements. Not all targets
15073support all bit widths or vector types, however.
15074
15075::
15076
15077      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
15078      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
15079      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15080
15081Overview:
15082"""""""""
15083
15084The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
15085the first two values are concatenated as { %a : %b } (%a is the most significant
15086bits of the wide value), the combined value is shifted right, and the least
15087significant bits are extracted to produce a result that is the same size as the
15088original arguments. If the first 2 arguments are identical, this is equivalent
15089to a rotate right operation. For vector types, the operation occurs for each
15090element of the vector. The shift argument is treated as an unsigned amount
15091modulo the element size of the arguments.
15092
15093Arguments:
15094""""""""""
15095
15096The first two arguments are the values to be concatenated. The third
15097argument is the shift amount. The arguments may be any integer type or a
15098vector with integer element type. All arguments and the return value must
15099have the same type.
15100
15101Example:
15102""""""""
15103
15104.. code-block:: text
15105
15106      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
15107      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
15108      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
15109      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)
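
The pseudo-expression in the first example line can be written out in C for
the ``i8`` case roughly as follows. ``fshr8_model`` is a hypothetical reference
model, not an LLVM API; it widens to 16 bits, shifts, and keeps the lower byte.

.. code-block:: c

      #include <stdint.h>

      /* Hypothetical model of llvm.fshr.i8; not an LLVM API. */
      static uint8_t fshr8_model(uint8_t a, uint8_t b, uint8_t c) {
        unsigned s = c % 8u;                        /* shift amount modulo the width */
        uint16_t wide = (uint16_t)((a << 8) | b);   /* a holds the high bits */
        /* Shift right, then take the low 8 bits. */
        return (uint8_t)(wide >> s);
      }

Passing the same value for both data operands gives a rotate right; for
instance ``fshr8_model(0x81, 0x81, 1)`` is ``0xC0``.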
15110
15111Arithmetic with Overflow Intrinsics
15112-----------------------------------
15113
15114LLVM provides intrinsics for fast arithmetic overflow checking.
15115
15116Each of these intrinsics returns a two-element struct. The first
15117element of this struct contains the result of the corresponding
15118arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
15119the result. Therefore, for example, the first element of the struct
15120returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
15121result of a 32-bit ``add`` instruction with the same operands, where
15122the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
15123
15124The second element of the result is an ``i1`` that is 1 if the
15125arithmetic operation overflowed and 0 otherwise. An operation
15126overflows if, for any values of its operands ``A`` and ``B`` and for
15127any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
15128not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
15129``sext`` for signed overflow and ``zext`` for unsigned overflow, and
15130``op`` is the underlying arithmetic operation.
15131
15132The behavior of these intrinsics is well-defined for all argument
15133values.
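
The widening definition above can be illustrated for
``llvm.sadd.with.overflow.i32`` with the C sketch below, which picks ``N = 64``.
The type ``sadd32_result`` and the function ``sadd32_model`` are hypothetical
names for this illustration; they are not LLVM APIs, and the struct layout is
not meant to match the IR aggregate.

.. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>
      #include <string.h>

      /* Hypothetical stand-in for the {i32, i1} result. */
      typedef struct {
        int32_t value;
        bool overflow;
      } sadd32_result;

      static sadd32_result sadd32_model(int32_t a, int32_t b) {
        sadd32_result r;
        /* First element: the sum modulo 2^32, computed in unsigned
           arithmetic so that the C code itself never overflows. */
        uint32_t wrapped = (uint32_t)a + (uint32_t)b;
        memcpy(&r.value, &wrapped, sizeof r.value);
        /* Second element: sext(a + b) compared with sext(a) + sext(b). */
        r.overflow = (int64_t)r.value != (int64_t)a + (int64_t)b;
        return r;
      }

The first field is produced for every input, which matches the statement that
the intrinsics are well-defined for all argument values.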
15134
15135'``llvm.sadd.with.overflow.*``' Intrinsics
15136^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15137
15138Syntax:
15139"""""""
15140
15141This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
15142on any integer bit width or vectors of integers.
15143
15144::
15145
15146      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
15147      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15148      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
15149      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15150
15151Overview:
15152"""""""""
15153
15154The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
15155a signed addition of the two arguments, and indicate whether an overflow
15156occurred during the signed summation.
15157
15158Arguments:
15159""""""""""
15160
15161The arguments (%a and %b) and the first element of the result structure
15162may be of integer types of any bit width, but they must have the same
15163bit width. The second element of the result structure must be of type
15164``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15165addition.
15166
15167Semantics:
15168""""""""""
15169
The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments. They return a structure whose
first element is the signed sum and whose second element is a bit
specifying whether the signed addition resulted in an overflow.
15175
15176Examples:
15177"""""""""
15178
15179.. code-block:: llvm
15180
15181      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15182      %sum = extractvalue {i32, i1} %res, 0
15183      %obit = extractvalue {i32, i1} %res, 1
15184      br i1 %obit, label %overflow, label %normal
15185
15186'``llvm.uadd.with.overflow.*``' Intrinsics
15187^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15188
15189Syntax:
15190"""""""
15191
15192This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
15193on any integer bit width or vectors of integers.
15194
15195::
15196
15197      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
15198      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15199      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
15200      declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15201
15202Overview:
15203"""""""""
15204
15205The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15206an unsigned addition of the two arguments, and indicate whether a carry
15207occurred during the unsigned summation.
15208
15209Arguments:
15210""""""""""
15211
15212The arguments (%a and %b) and the first element of the result structure
15213may be of integer types of any bit width, but they must have the same
15214bit width. The second element of the result structure must be of type
15215``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15216addition.
15217
15218Semantics:
15219""""""""""
15220
The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
an unsigned addition of the two arguments. They return a structure whose
first element is the sum and whose second element is a bit specifying
whether the unsigned addition resulted in a carry.
15225
15226Examples:
15227"""""""""
15228
15229.. code-block:: llvm
15230
15231      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15232      %sum = extractvalue {i32, i1} %res, 0
15233      %obit = extractvalue {i32, i1} %res, 1
15234      br i1 %obit, label %carry, label %normal
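
For the ``i32`` case, the carry condition can be modeled in C as shown below.
The names ``uadd32_result`` and ``uadd32_model`` are hypothetical and exist
only for this sketch.

.. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical stand-in for the {i32, i1} result. */
      typedef struct {
        uint32_t value;
        bool carry;
      } uadd32_result;

      static uadd32_result uadd32_model(uint32_t a, uint32_t b) {
        uadd32_result r;
        r.value = a + b;          /* unsigned addition wraps modulo 2^32 */
        r.carry = r.value < a;    /* a wrapped sum is smaller than either operand */
        return r;
      }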
15235
15236'``llvm.ssub.with.overflow.*``' Intrinsics
15237^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15238
15239Syntax:
15240"""""""
15241
15242This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
15243on any integer bit width or vectors of integers.
15244
15245::
15246
15247      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
15248      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
15249      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
15250      declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15251
15252Overview:
15253"""""""""
15254
15255The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
15256a signed subtraction of the two arguments, and indicate whether an
15257overflow occurred during the signed subtraction.
15258
15259Arguments:
15260""""""""""
15261
15262The arguments (%a and %b) and the first element of the result structure
15263may be of integer types of any bit width, but they must have the same
15264bit width. The second element of the result structure must be of type
15265``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15266subtraction.
15267
15268Semantics:
15269""""""""""
15270
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction resulted in an overflow.
15276
15277Examples:
15278"""""""""
15279
15280.. code-block:: llvm
15281
15282      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
15283      %sum = extractvalue {i32, i1} %res, 0
15284      %obit = extractvalue {i32, i1} %res, 1
15285      br i1 %obit, label %overflow, label %normal
15286
15287'``llvm.usub.with.overflow.*``' Intrinsics
15288^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15289
15290Syntax:
15291"""""""
15292
15293This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
15294on any integer bit width or vectors of integers.
15295
15296::
15297
15298      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
15299      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
15300      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
15301      declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15302
15303Overview:
15304"""""""""
15305
15306The '``llvm.usub.with.overflow``' family of intrinsic functions perform
15307an unsigned subtraction of the two arguments, and indicate whether an
15308overflow occurred during the unsigned subtraction.
15309
15310Arguments:
15311""""""""""
15312
15313The arguments (%a and %b) and the first element of the result structure
15314may be of integer types of any bit width, but they must have the same
15315bit width. The second element of the result structure must be of type
15316``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15317subtraction.
15318
15319Semantics:
15320""""""""""
15321
15322The '``llvm.usub.with.overflow``' family of intrinsic functions perform
15323an unsigned subtraction of the two arguments. They return a structure ---
15324the first element of which is the subtraction, and the second element of
15325which is a bit specifying if the unsigned subtraction resulted in an
15326overflow.
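
A minimal Python sketch of the scalar case may help here. It is an informal
model rather than LLVM code: the name ``usub_with_overflow`` and the ``bits``
parameter are hypothetical, and it assumes the first result element is the
difference reduced modulo ``2**bits``.

.. code-block:: python

      def usub_with_overflow(a, b, bits):
          """Hypothetical model: returns (wrapped difference, overflow bit)."""
          exact = a - b  # operands are taken as unsigned values in [0, 2**bits)
          # Assumption: the first element is the result reduced modulo 2**bits.
          return exact % (1 << bits), exact < 0

      print(usub_with_overflow(5, 3, 8))  # (2, False)
      print(usub_with_overflow(3, 5, 8))  # (254, True)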
15327
15328Examples:
15329"""""""""
15330
15331.. code-block:: llvm
15332
15333      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
15335      %obit = extractvalue {i32, i1} %res, 1
15336      br i1 %obit, label %overflow, label %normal
15337
15338'``llvm.smul.with.overflow.*``' Intrinsics
15339^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15340
15341Syntax:
15342"""""""
15343
15344This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
15345on any integer bit width or vectors of integers.
15346
15347::
15348
15349      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
15350      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
15351      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
15352      declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15353
15354Overview:
15355"""""""""
15356
15357The '``llvm.smul.with.overflow``' family of intrinsic functions perform
15358a signed multiplication of the two arguments, and indicate whether an
15359overflow occurred during the signed multiplication.
15360
15361Arguments:
15362""""""""""
15363
15364The arguments (%a and %b) and the first element of the result structure
15365may be of integer types of any bit width, but they must have the same
15366bit width. The second element of the result structure must be of type
15367``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15368multiplication.
15369
15370Semantics:
15371""""""""""
15372
15373The '``llvm.smul.with.overflow``' family of intrinsic functions perform
15374a signed multiplication of the two arguments. They return a structure ---
15375the first element of which is the multiplication, and the second element
15376of which is a bit specifying if the signed multiplication resulted in an
15377overflow.
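
The following Python sketch models the scalar case informally. The name
``smul_with_overflow`` and the ``bits`` parameter are hypothetical, and the
wrapped first element is an assumption of this model rather than something
stated above.

.. code-block:: python

      def smul_with_overflow(a, b, bits):
          """Hypothetical model: returns (wrapped product, overflow bit)."""
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          exact = a * b  # exact product, computed without overflow
          # Assumption: the first element is the two's-complement wrapped result.
          wrapped = (exact - lo) % (1 << bits) + lo
          return wrapped, not (lo <= exact <= hi)

      print(smul_with_overflow(-16, 8, 8))  # (-128, False)
      print(smul_with_overflow(16, 8, 8))   # (-128, True)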
15378
15379Examples:
15380"""""""""
15381
15382.. code-block:: llvm
15383
15384      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15386      %obit = extractvalue {i32, i1} %res, 1
15387      br i1 %obit, label %overflow, label %normal
15388
15389'``llvm.umul.with.overflow.*``' Intrinsics
15390^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15391
15392Syntax:
15393"""""""
15394
15395This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
15396on any integer bit width or vectors of integers.
15397
15398::
15399
15400      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
15401      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
15402      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
15403      declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15404
15405Overview:
15406"""""""""
15407
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
overflow occurred during the unsigned multiplication.
15411
15412Arguments:
15413""""""""""
15414
15415The arguments (%a and %b) and the first element of the result structure
15416may be of integer types of any bit width, but they must have the same
15417bit width. The second element of the result structure must be of type
15418``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15419multiplication.
15420
15421Semantics:
15422""""""""""
15423
15424The '``llvm.umul.with.overflow``' family of intrinsic functions perform
15425an unsigned multiplication of the two arguments. They return a structure ---
15426the first element of which is the multiplication, and the second
15427element of which is a bit specifying if the unsigned multiplication
15428resulted in an overflow.
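
For illustration only, a small Python model of the scalar case follows. The
name ``umul_with_overflow`` and the ``bits`` parameter are hypothetical, and
the model assumes the first result element is the product reduced modulo
``2**bits``.

.. code-block:: python

      def umul_with_overflow(a, b, bits):
          """Hypothetical model: returns (wrapped product, overflow bit)."""
          exact = a * b  # operands are taken as unsigned values in [0, 2**bits)
          # Assumption: the first element is the result reduced modulo 2**bits.
          return exact % (1 << bits), exact >= (1 << bits)

      print(umul_with_overflow(15, 17, 8))  # (255, False)
      print(umul_with_overflow(16, 16, 8))  # (0, True)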
15429
15430Examples:
15431"""""""""
15432
15433.. code-block:: llvm
15434
15435      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15437      %obit = extractvalue {i32, i1} %res, 1
15438      br i1 %obit, label %overflow, label %normal
15439
15440Saturation Arithmetic Intrinsics
15441---------------------------------
15442
15443Saturation arithmetic is a version of arithmetic in which operations are
15444limited to a fixed range between a minimum and maximum value. If the result of
15445an operation is greater than the maximum value, the result is set (or
15446"clamped") to this maximum. If it is below the minimum, it is clamped to this
15447minimum.
15448
15449
15450'``llvm.sadd.sat.*``' Intrinsics
15451^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15452
15453Syntax
15454"""""""
15455
15456This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
15457on any integer bit width or vectors of integers.
15458
15459::
15460
15461      declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
15462      declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
15463      declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
15464      declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15465
15466Overview
15467"""""""""
15468
15469The '``llvm.sadd.sat``' family of intrinsic functions perform signed
15470saturating addition on the 2 arguments.
15471
15472Arguments
15473""""""""""
15474
15475The arguments (%a and %b) and the result may be of integer types of any bit
15476width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15477values that will undergo signed addition.
15478
15479Semantics:
15480""""""""""
15481
15482The maximum value this operation can clamp to is the largest signed value
15483representable by the bit width of the arguments. The minimum value is the
15484smallest signed value representable by this bit width.
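
The clamping can be sketched in Python as an informal model (not LLVM code).
The name ``sadd_sat`` and the ``bits`` parameter are hypothetical; only scalar
operands are modeled.

.. code-block:: python

      def sadd_sat(a, b, bits):
          """Hypothetical model of signed saturating addition."""
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          # Compute the exact sum, then clamp it into [lo, hi].
          return max(lo, min(hi, a + b))

      print(sadd_sat(5, 6, 4))    # 7
      print(sadd_sat(-4, -5, 4))  # -8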
15485
15486
15487Examples
15488"""""""""
15489
15490.. code-block:: llvm
15491
15492      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
15493      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
15494      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
15495      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8
15496
15497
15498'``llvm.uadd.sat.*``' Intrinsics
15499^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15500
15501Syntax
15502"""""""
15503
15504This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
15505on any integer bit width or vectors of integers.
15506
15507::
15508
15509      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
15510      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
15511      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
15512      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15513
15514Overview
15515"""""""""
15516
15517The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
15518saturating addition on the 2 arguments.
15519
15520Arguments
15521""""""""""
15522
15523The arguments (%a and %b) and the result may be of integer types of any bit
15524width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15525values that will undergo unsigned addition.
15526
15527Semantics:
15528""""""""""
15529
15530The maximum value this operation can clamp to is the largest unsigned value
15531representable by the bit width of the arguments. Because this is an unsigned
15532operation, the result will never saturate towards zero.
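
As a rough Python sketch of this behavior (an informal model with the
hypothetical name ``uadd_sat`` and an explicit ``bits`` parameter), only the
upper bound ever needs to be applied:

.. code-block:: python

      def uadd_sat(a, b, bits):
          """Hypothetical model of unsigned saturating addition."""
          hi = (1 << bits) - 1  # largest unsigned value for this width
          # A sum of unsigned values cannot go below zero, so clamp only above.
          return min(hi, a + b)

      print(uadd_sat(5, 6, 4))  # 11
      print(uadd_sat(8, 8, 4))  # 15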
15533
15534
15535Examples
15536"""""""""
15537
15538.. code-block:: llvm
15539
15540      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
15541      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
15542      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15
15543
15544
15545'``llvm.ssub.sat.*``' Intrinsics
15546^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15547
15548Syntax
15549"""""""
15550
15551This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
15552on any integer bit width or vectors of integers.
15553
15554::
15555
15556      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
15557      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
15558      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
15559      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15560
15561Overview
15562"""""""""
15563
15564The '``llvm.ssub.sat``' family of intrinsic functions perform signed
15565saturating subtraction on the 2 arguments.
15566
15567Arguments
15568""""""""""
15569
15570The arguments (%a and %b) and the result may be of integer types of any bit
15571width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15572values that will undergo signed subtraction.
15573
15574Semantics:
15575""""""""""
15576
15577The maximum value this operation can clamp to is the largest signed value
15578representable by the bit width of the arguments. The minimum value is the
15579smallest signed value representable by this bit width.
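
An informal Python model of the scalar case is sketched below. The name
``ssub_sat`` and the ``bits`` parameter are hypothetical and chosen only for
this illustration.

.. code-block:: python

      def ssub_sat(a, b, bits):
          """Hypothetical model of signed saturating subtraction."""
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          # Compute the exact difference, then clamp it into [lo, hi].
          return max(lo, min(hi, a - b))

      print(ssub_sat(-4, 5, 4))  # -8
      print(ssub_sat(4, -5, 4))  # 7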
15580
15581
15582Examples
15583"""""""""
15584
15585.. code-block:: llvm
15586
15587      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
15588      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
15589      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
15590      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7
15591
15592
15593'``llvm.usub.sat.*``' Intrinsics
15594^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15595
15596Syntax
15597"""""""
15598
15599This is an overloaded intrinsic. You can use ``llvm.usub.sat``
15600on any integer bit width or vectors of integers.
15601
15602::
15603
15604      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
15605      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
15606      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
15607      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15608
15609Overview
15610"""""""""
15611
15612The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
15613saturating subtraction on the 2 arguments.
15614
15615Arguments
15616""""""""""
15617
15618The arguments (%a and %b) and the result may be of integer types of any bit
15619width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15620values that will undergo unsigned subtraction.
15621
15622Semantics:
15623""""""""""
15624
15625The minimum value this operation can clamp to is 0, which is the smallest
15626unsigned value representable by the bit width of the unsigned arguments.
15627Because this is an unsigned operation, the result will never saturate towards
15628the largest possible value representable by this bit width.
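
In a Python sketch (an informal model; the name ``usub_sat`` is hypothetical),
this amounts to clamping the exact difference at zero:

.. code-block:: python

      def usub_sat(a, b):
          """Hypothetical model of unsigned saturating subtraction."""
          # The difference of two unsigned values never exceeds the first
          # operand, so only the lower bound (zero) needs to be applied.
          return max(0, a - b)

      print(usub_sat(2, 1))  # 1
      print(usub_sat(2, 6))  # 0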
15629
15630
15631Examples
15632"""""""""
15633
15634.. code-block:: llvm
15635
15636      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
15637      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0
15638
15639
15640'``llvm.sshl.sat.*``' Intrinsics
15641^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15642
15643Syntax
15644"""""""
15645
15646This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
15647on integers or vectors of integers of any bit width.
15648
15649::
15650
15651      declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
15652      declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
15653      declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
15654      declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15655
15656Overview
15657"""""""""
15658
15659The '``llvm.sshl.sat``' family of intrinsic functions perform signed
15660saturating left shift on the first argument.
15661
15662Arguments
15663""""""""""
15664
15665The arguments (``%a`` and ``%b``) and the result may be of integer types of any
15666bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15672
15673
15674Semantics:
15675""""""""""
15676
15677The maximum value this operation can clamp to is the largest signed value
15678representable by the bit width of the arguments. The minimum value is the
15679smallest signed value representable by this bit width.
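
The following Python sketch models the scalar case informally. The name
``sshl_sat`` and the ``bits`` parameter are hypothetical, and raising an
exception for an out-of-range shift amount is merely a stand-in for the poison
result described under Arguments.

.. code-block:: python

      def sshl_sat(a, b, bits):
          """Hypothetical model of signed saturating left shift."""
          if not 0 <= b < bits:
              # Stand-in for poison: this model refuses oversized shift amounts.
              raise ValueError("shift amount out of range")
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          # Shift exactly (Python integers are unbounded), then clamp.
          return max(lo, min(hi, a << b))

      print(sshl_sat(2, 2, 4))   # 7
      print(sshl_sat(-5, 1, 4))  # -8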
15680
15681
15682Examples
15683"""""""""
15684
15685.. code-block:: llvm
15686
15687      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
15688      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
15689      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
15690      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2
15691
15692
15693'``llvm.ushl.sat.*``' Intrinsics
15694^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15695
15696Syntax
15697"""""""
15698
15699This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
15700on integers or vectors of integers of any bit width.
15701
15702::
15703
15704      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
15705      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
15706      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
15707      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15708
15709Overview
15710"""""""""
15711
15712The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
15713saturating left shift on the first argument.
15714
15715Arguments
15716""""""""""
15717
15718The arguments (``%a`` and ``%b``) and the result may be of integer types of any
15719bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15725
15726Semantics:
15727""""""""""
15728
15729The maximum value this operation can clamp to is the largest unsigned value
15730representable by the bit width of the arguments.
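
A minimal Python sketch of the scalar case follows. It is an informal model:
the name ``ushl_sat`` and the ``bits`` parameter are hypothetical, and the
exception stands in for the poison result described under Arguments.

.. code-block:: python

      def ushl_sat(a, b, bits):
          """Hypothetical model of unsigned saturating left shift."""
          if not 0 <= b < bits:
              # Stand-in for poison: this model refuses oversized shift amounts.
              raise ValueError("shift amount out of range")
          # Shift exactly, then clamp to the largest unsigned value.
          return min((1 << bits) - 1, a << b)

      print(ushl_sat(2, 1, 4))  # 4
      print(ushl_sat(3, 3, 4))  # 15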
15731
15732
15733Examples
15734"""""""""
15735
15736.. code-block:: llvm
15737
15738      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
15739      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15
15740
15741
15742Fixed Point Arithmetic Intrinsics
15743---------------------------------
15744
15745A fixed point number represents a real data type for a number that has a fixed
15746number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. These
15748are useful for representing fractional values to a specific precision. The
15749following intrinsics perform fixed point arithmetic operations on 2 operands
15750of the same scale, specified as the third argument.
15751
15752The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
15753of fixed point numbers through scaled integers. Therefore, fixed point
15754multiplication can be represented as
15755
15756.. code-block:: llvm
15757
15758        %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)
15759
15760        ; Expands to
15761        %a2 = sext i4 %a to i8
15762        %b2 = sext i4 %b to i8
        %mul = mul nsw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
15766        %result = trunc i8 %r to i4
15767
15768The ``llvm.*div.fix`` family of intrinsic functions represents a division of
15769fixed point numbers through scaled integers. Fixed point division can be
15770represented as:
15771
15772.. code-block:: llvm
15773
        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)
15775
15776        ; Expands to
15777        %a2 = sext i4 %a to i8
15778        %b2 = sext i4 %b to i8
15779        %scale2 = trunc i32 %scale to i8
15780        %a3 = shl i8 %a2, %scale2
15781        %r = sdiv i8 %a3, %b2 ; this is for a target rounding towards zero
15782        %result = trunc i8 %r to i4
15783
15784For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The IR does not fix the rounding
direction, since the preferred rounding may vary between targets; instead, the
rounding is specified through a target hook. Different pipelines should
legalize or optimize this
15788using the rounding specified by this hook if it is provided. Operations like
15789constant folding, instruction combining, KnownBits, and ValueTracking should
15790also use this hook, if provided, and not assume the direction of rounding. A
15791rounded result must always be within one unit of precision from the true
15792result. That is, the error between the returned result and the true result must
15793be less than 1/2^(scale).
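
The rounding freedom can be illustrated with a small Python model of signed
fixed point multiplication on raw (scaled) integers. This is an informal
sketch, not LLVM code: the name ``smul_fix`` and the ``round_toward_zero`` flag
are hypothetical, the flag merely stands in for a target's rounding choice, and
no range check is performed.

.. code-block:: python

      def smul_fix(a, b, scale, round_toward_zero=False):
          """Hypothetical model of llvm.smul.fix on raw (scaled) integers."""
          prod = a * b  # exact product, carrying 2 * scale fractional bits
          if round_toward_zero:
              q = abs(prod) >> scale
              return -q if prod < 0 else q
          return prod >> scale  # arithmetic shift: rounds toward negative infinity

      # 1.5 x -1.5 = -2.25 at scale 1: either neighbor is an admissible result.
      print(smul_fix(3, -3, 1))                          # -5, i.e. -2.5
      print(smul_fix(3, -3, 1, round_toward_zero=True))  # -4, i.e. -2.0

Both results differ from the exact raw value (-4.5) by less than one unit,
which is the error bound stated above.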
15794
15795
15796'``llvm.smul.fix.*``' Intrinsics
15797^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15798
15799Syntax
15800"""""""
15801
15802This is an overloaded intrinsic. You can use ``llvm.smul.fix``
15803on any integer bit width or vectors of integers.
15804
15805::
15806
15807      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
15808      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
15809      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
15810      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15811
15812Overview
15813"""""""""
15814
15815The '``llvm.smul.fix``' family of intrinsic functions perform signed
15816fixed point multiplication on 2 arguments of the same scale.
15817
15818Arguments
15819""""""""""
15820
15821The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the two
15824values that will undergo signed fixed point multiplication. The argument
15825``%scale`` represents the scale of both operands, and must be a constant
15826integer.
15827
15828Semantics:
15829""""""""""
15830
15831This operation performs fixed point multiplication on the 2 arguments of a
15832specified scale. The result will also be returned in the same scale specified
15833in the third argument.
15834
15835If the result value cannot be precisely represented in the given scale, the
15836value is rounded up or down to the closest representable value. The rounding
15837direction is unspecified.
15838
15839It is undefined behavior if the result value does not fit within the range of
15840the fixed point type.
15841
15842
15843Examples
15844"""""""""
15845
15846.. code-block:: llvm
15847
15848      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15849      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15850      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
15851
15852      ; The result in the following could be rounded up to -2 or down to -2.5
15853      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15854
15855
15856'``llvm.umul.fix.*``' Intrinsics
15857^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15858
15859Syntax
15860"""""""
15861
15862This is an overloaded intrinsic. You can use ``llvm.umul.fix``
15863on any integer bit width or vectors of integers.
15864
15865::
15866
15867      declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
15868      declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
15869      declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
15870      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15871
15872Overview
15873"""""""""
15874
15875The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
15876fixed point multiplication on 2 arguments of the same scale.
15877
15878Arguments
15879""""""""""
15880
15881The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the two
15884values that will undergo unsigned fixed point multiplication. The argument
15885``%scale`` represents the scale of both operands, and must be a constant
15886integer.
15887
15888Semantics:
15889""""""""""
15890
15891This operation performs unsigned fixed point multiplication on the 2 arguments of a
15892specified scale. The result will also be returned in the same scale specified
15893in the third argument.
15894
15895If the result value cannot be precisely represented in the given scale, the
15896value is rounded up or down to the closest representable value. The rounding
15897direction is unspecified.
15898
15899It is undefined behavior if the result value does not fit within the range of
15900the fixed point type.
15901
15902
15903Examples
15904"""""""""
15905
15906.. code-block:: llvm
15907
15908      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15909      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15910
15911      ; The result in the following could be rounded down to 3.5 or up to 4
15912      %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
15913
15914
15915'``llvm.smul.fix.sat.*``' Intrinsics
15916^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15917
15918Syntax
15919"""""""
15920
15921This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
15922on any integer bit width or vectors of integers.
15923
15924::
15925
15926      declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
15927      declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
15928      declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
15929      declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15930
15931Overview
15932"""""""""
15933
15934The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
15935fixed point saturating multiplication on 2 arguments of the same scale.
15936
15937Arguments
15938""""""""""
15939
15940The arguments (%a and %b) and the result may be of integer types of any bit
15941width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15942values that will undergo signed fixed point multiplication. The argument
15943``%scale`` represents the scale of both operands, and must be a constant
15944integer.
15945
15946Semantics:
15947""""""""""
15948
15949This operation performs fixed point multiplication on the 2 arguments of a
15950specified scale. The result will also be returned in the same scale specified
15951in the third argument.
15952
15953If the result value cannot be precisely represented in the given scale, the
15954value is rounded up or down to the closest representable value. The rounding
15955direction is unspecified.
15956
15957The maximum value this operation can clamp to is the largest signed value
15958representable by the bit width of the first 2 arguments. The minimum value is the
15959smallest signed value representable by this bit width.
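
As an informal Python sketch (not LLVM code), the operation can be modeled as
an exact multiplication, a rescale, and a clamp. The name ``smul_fix_sat`` and
the ``bits`` parameter are hypothetical, and rounding toward negative infinity
is an assumption of this model, since the rounding direction is unspecified.

.. code-block:: python

      def smul_fix_sat(a, b, scale, bits):
          """Hypothetical model of signed saturating fixed point multiply."""
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          # Assumption: round toward negative infinity (arithmetic shift).
          scaled = (a * b) >> scale
          return max(lo, min(hi, scaled))

      print(smul_fix_sat(2, 4, 0, 4))   # 7
      print(smul_fix_sat(2, 4, 1, 4))   # 4
      print(smul_fix_sat(-8, 5, 2, 4))  # -8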
15960
15961
15962Examples
15963"""""""""
15964
15965.. code-block:: llvm
15966
15967      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15968      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15969      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
15970
15971      ; The result in the following could be rounded up to -2 or down to -2.5
15972      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15973
15974      ; Saturation
15975      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
15976      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
15977      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
15978      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7
15979
15980      ; Scale can affect the saturation result
15981      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
15982      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)
15983
15984
15985'``llvm.umul.fix.sat.*``' Intrinsics
15986^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15987
15988Syntax
15989"""""""
15990
15991This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
15992on any integer bit width or vectors of integers.
15993
15994::
15995
15996      declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
15997      declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
15998      declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
15999      declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16000
16001Overview
16002"""""""""
16003
16004The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
16005fixed point saturating multiplication on 2 arguments of the same scale.
16006
16007Arguments
16008""""""""""
16009
16010The arguments (%a and %b) and the result may be of integer types of any bit
16011width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16012values that will undergo unsigned fixed point multiplication. The argument
16013``%scale`` represents the scale of both operands, and must be a constant
16014integer.
16015
16016Semantics:
16017""""""""""
16018
16019This operation performs fixed point multiplication on the 2 arguments of a
16020specified scale. The result will also be returned in the same scale specified
16021in the third argument.
16022
16023If the result value cannot be precisely represented in the given scale, the
16024value is rounded up or down to the closest representable value. The rounding
16025direction is unspecified.
16026
16027The maximum value this operation can clamp to is the largest unsigned value
16028representable by the bit width of the first 2 arguments. The minimum value is the
16029smallest unsigned value representable by this bit width (zero).
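
A rough Python model of the scalar case is sketched below. The name
``umul_fix_sat`` and the ``bits`` parameter are hypothetical, and rounding down
is an assumption of this model, since the rounding direction is unspecified.

.. code-block:: python

      def umul_fix_sat(a, b, scale, bits):
          """Hypothetical model of unsigned saturating fixed point multiply."""
          hi = (1 << bits) - 1  # largest unsigned value for this width
          # Assumption: round down (logical shift of the exact product).
          scaled = (a * b) >> scale
          # The product of unsigned values is never negative: clamp only above.
          return min(hi, scaled)

      print(umul_fix_sat(8, 2, 0, 4))  # 15
      print(umul_fix_sat(8, 8, 2, 4))  # 15
      print(umul_fix_sat(3, 2, 1, 4))  # 3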
16030
16031
16032Examples
16033"""""""""
16034
16035.. code-block:: llvm
16036
16037      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16038      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16039
16040      ; The result in the following could be rounded down to 2 or up to 2.5
16041      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
16042
16043      ; Saturation
16044      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
16045      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)
16046
      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
16050
16051
16052'``llvm.sdiv.fix.*``' Intrinsics
16053^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16054
16055Syntax
16056"""""""
16057
16058This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
16059on any integer bit width or vectors of integers.
16060
16061::
16062
16063      declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16064      declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16065      declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16066      declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16067
16068Overview
16069"""""""""
16070
16071The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
16072fixed point division on 2 arguments of the same scale.
16073
16074Arguments
16075""""""""""
16076
16077The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the two
16080values that will undergo signed fixed point division. The argument
16081``%scale`` represents the scale of both operands, and must be a constant
16082integer.
16083
16084Semantics:
16085""""""""""
16086
16087This operation performs fixed point division on the 2 arguments of a
16088specified scale. The result will also be returned in the same scale specified
16089in the third argument.
16090
16091If the result value cannot be precisely represented in the given scale, the
16092value is rounded up or down to the closest representable value. The rounding
16093direction is unspecified.
16094
16095It is undefined behavior if the result value does not fit within the range of
16096the fixed point type, or if the second argument is zero.
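
The following Python sketch models the scalar case informally. The name
``sdiv_fix`` and the ``round_toward_zero`` flag are hypothetical; the flag
stands in for a target's rounding choice. The model does not check that the
result fits the fixed point type, and raising an exception for a zero divisor
is merely a stand-in for undefined behavior.

.. code-block:: python

      def sdiv_fix(a, b, scale, round_toward_zero=True):
          """Hypothetical model of llvm.sdiv.fix on raw (scaled) integers."""
          if b == 0:
              raise ZeroDivisionError("stand-in for undefined behavior")
          num = a << scale  # pre-shift so the quotient keeps the same scale
          q = num // b      # Python floor division: toward negative infinity
          if round_toward_zero and q < 0 and q * b != num:
              q += 1        # move an inexact negative quotient toward zero
          return q

      # 1.5 / -2 = -0.75 at scale 1: either neighbor is an admissible result.
      print(sdiv_fix(3, -4, 1))                           # -1, i.e. -0.5
      print(sdiv_fix(3, -4, 1, round_toward_zero=False))  # -2, i.e. -1.0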
16097
16098
16099Examples
16100"""""""""
16101
16102.. code-block:: llvm
16103
16104      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16105      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16106      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16107
16108      ; The result in the following could be rounded up to 1 or down to 0.5
16109      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16110
16111
16112'``llvm.udiv.fix.*``' Intrinsics
16113^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16114
16115Syntax
16116"""""""
16117
16118This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
16119on any integer bit width or vectors of integers.
16120
16121::
16122
16123      declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16124      declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16125      declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16126      declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16127
16128Overview
16129"""""""""
16130
16131The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
16132fixed point division on 2 arguments of the same scale.
16133
16134Arguments
16135""""""""""
16136
16137The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the two
16140values that will undergo unsigned fixed point division. The argument
16141``%scale`` represents the scale of both operands, and must be a constant
16142integer.
16143
16144Semantics:
16145""""""""""
16146
16147This operation performs fixed point division on the 2 arguments of a
16148specified scale. The result will also be returned in the same scale specified
16149in the third argument.
16150
16151If the result value cannot be precisely represented in the given scale, the
16152value is rounded up or down to the closest representable value. The rounding
16153direction is unspecified.
16154
16155It is undefined behavior if the result value does not fit within the range of
16156the fixed point type, or if the second argument is zero.
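
An informal Python model of the scalar case is sketched below. The name
``udiv_fix`` is hypothetical, rounding down is an assumption of this model, and
no range check is performed; the exception for a zero divisor is only a
stand-in for undefined behavior.

.. code-block:: python

      def udiv_fix(a, b, scale):
          """Hypothetical model of llvm.udiv.fix on raw (scaled) integers."""
          if b == 0:
              raise ZeroDivisionError("stand-in for undefined behavior")
          # Pre-shift the dividend so the quotient keeps the same scale.
          # Assumption: round down (floor division of non-negative values).
          return (a << scale) // b

      print(udiv_fix(6, 4, 1))  # 3, i.e. 3 / 2 = 1.5
      print(udiv_fix(1, 8, 4))  # 2, i.e. 0.0625 / 0.5 = 0.125
      print(udiv_fix(3, 4, 1))  # 1, i.e. 0.75 rounded down to 0.5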
16157
16158
16159Examples
16160"""""""""
16161
16162.. code-block:: llvm
16163
      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16170
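One way to read the semantics above is as a widened integer division: the
dividend is zero-extended, shifted left by the scale, and divided by the
zero-extended divisor. The sketch below spells this out for ``i16`` operands
with a scale of 8. The function name ``@udiv_fix_sketch`` is hypothetical, and
the truncating ``udiv`` picks just one of the rounding directions the intrinsic
permits; this is an illustration of the arithmetic, not a description of how
any target lowers the intrinsic.

.. code-block:: llvm

      define i16 @udiv_fix_sketch(i16 %a, i16 %b) {
        %a.wide = zext i16 %a to i32
        %b.wide = zext i16 %b to i32
        %num = shl i32 %a.wide, 8        ; pre-scale the dividend by 2^scale
        %quot = udiv i32 %num, %b.wide   ; rounds toward zero; undefined if %b is 0
        %res = trunc i32 %quot to i16    ; only meaningful if the quotient fits in i16
        ret i16 %res
      }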
16171
16172'``llvm.sdiv.fix.sat.*``' Intrinsics
16173^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16174
16175Syntax
16176"""""""
16177
16178This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
16179on any integer bit width or vectors of integers.
16180
16181::
16182
16183      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16184      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16185      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16186      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16187
16188Overview
16189"""""""""
16190
The '``llvm.sdiv.fix.sat``' family of intrinsic functions performs signed
fixed point saturating division on two arguments of the same scale.
16193
16194Arguments
16195""""""""""
16196
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
16202
16203Semantics:
16204""""""""""
16205
16206This operation performs fixed point division on the 2 arguments of a
16207specified scale. The result will also be returned in the same scale specified
16208in the third argument.
16209
16210If the result value cannot be precisely represented in the given scale, the
16211value is rounded up or down to the closest representable value. The rounding
16212direction is unspecified.
16213
16214The maximum value this operation can clamp to is the largest signed value
16215representable by the bit width of the first 2 arguments. The minimum value is the
16216smallest signed value representable by this bit width.
16217
16218It is undefined behavior if the second argument is zero.
16219
16220
16221Examples
16222"""""""""
16223
16224.. code-block:: llvm
16225
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)
16237
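For a wider type, a call might look like the following sketch, which treats
both operands as signed values with 16 fractional bits. The function name
``@q16_div_sat`` and the choice of scale are assumptions made for illustration
only; note that the scale is passed as a constant.

.. code-block:: llvm

      declare i32 @llvm.sdiv.fix.sat.i32(i32, i32, i32)

      define i32 @q16_div_sat(i32 %x, i32 %y) {
        ; Quotient in the same scale as the operands, clamped to the i32 range.
        ; Behavior is still undefined if %y is zero.
        %q = call i32 @llvm.sdiv.fix.sat.i32(i32 %x, i32 %y, i32 16)
        ret i32 %q
      }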
16238
16239'``llvm.udiv.fix.sat.*``' Intrinsics
16240^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16241
16242Syntax
16243"""""""
16244
16245This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
16246on any integer bit width or vectors of integers.
16247
16248::
16249
16250      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16251      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16252      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16253      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16254
16255Overview
16256"""""""""
16257
The '``llvm.udiv.fix.sat``' family of intrinsic functions performs unsigned
fixed point saturating division on two arguments of the same scale.
16260
16261Arguments
16262""""""""""
16263
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
16269
16270Semantics:
16271""""""""""
16272
16273This operation performs fixed point division on the 2 arguments of a
16274specified scale. The result will also be returned in the same scale specified
16275in the third argument.
16276
16277If the result value cannot be precisely represented in the given scale, the
16278value is rounded up or down to the closest representable value. The rounding
16279direction is unspecified.
16280
16281The maximum value this operation can clamp to is the largest unsigned value
16282representable by the bit width of the first 2 arguments. The minimum value is the
16283smallest unsigned value representable by this bit width (zero).
16284
16285It is undefined behavior if the second argument is zero.
16286
16287Examples
16288"""""""""
16289
16290.. code-block:: llvm
16291
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)

      ; The result in the following could be rounded down to 0.5 or up to 1
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)
16300
16301
16302Specialised Arithmetic Intrinsics
16303---------------------------------
16304
16305.. _i_intr_llvm_canonicalize:
16306
16307'``llvm.canonicalize.*``' Intrinsic
16308^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16309
16310Syntax:
16311"""""""
16312
16313::
16314
16315      declare float @llvm.canonicalize.f32(float %a)
16316      declare double @llvm.canonicalize.f64(double %b)
16317
16318Overview:
16319"""""""""
16320
The '``llvm.canonicalize.*``' intrinsic returns the platform-specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as ``frexp``. The canonical
encoding is defined by IEEE-754-2008 to be:
16325
16326::
16327
16328      2.1.8 canonical encoding: The preferred encoding of a floating-point
16329      representation in a format. Applied to declets, significands of finite
16330      numbers, infinities, and NaNs, especially in decimal formats.
16331
16332This operation can also be considered equivalent to the IEEE-754-2008
16333conversion of a floating-point value to the same format. NaNs are handled
16334according to section 6.2.
16335
16336Examples of non-canonical encodings:
16337
16338- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
16339  converted to a canonical representation per hardware-specific protocol.
16340- Many normal decimal floating-point numbers have non-canonical alternative
16341  encodings.
16342- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
16343  These are treated as non-canonical encodings of zero and will be flushed to
16344  a zero of the same sign by this operation.
16345
16346Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
16347default exception handling must signal an invalid exception, and produce a
16348quiet NaN result.
16349
16350This function should always be implementable as multiplication by 1.0, provided
16351that the compiler does not constant fold the operation. Likewise, division by
163521.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
16353-0.0 is also sufficient provided that the rounding mode is not -Infinity.
16354
16355``@llvm.canonicalize`` must preserve the equality relation. That is:
16356
16357- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``
16360
16361Additionally, the sign of zero must be conserved:
16362``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
16363
16364The payload bits of a NaN must be conserved, with two exceptions.
16365First, environments which use only a single canonical representation of NaN
16366must perform said canonicalization. Second, SNaNs must be quieted per the
16367usual methods.
16368
16369The canonicalization operation may be optimized away if:
16370
16371- The input is known to be canonical. For example, it was produced by a
16372  floating-point operation that is required by the standard to be canonical.
16373- The result is consumed only by (or fused with) other floating-point
16374  operations. That is, the bits of the floating-point value are not examined.
16375
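As a small illustration, the sketch below canonicalizes a value and then
inspects its bit pattern. Because the bits of the result are examined, the
second exemption listed above does not apply here. The function name
``@canonical_bits`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      define i32 @canonical_bits(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        %bits = bitcast float %c to i32   ; the encoding itself is observed
        ret i32 %bits
      }
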
16376'``llvm.fmuladd.*``' Intrinsic
16377^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16378
16379Syntax:
16380"""""""
16381
16382::
16383
16384      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
16385      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
16386
16387Overview:
16388"""""""""
16389
16390The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
16391expressions that can be fused if the code generator determines that (a) the
16392target instruction set has support for a fused operation, and (b) that the
16393fused operation is more efficient than the equivalent, separate pair of mul
16394and add instructions.
16395
16396Arguments:
16397""""""""""
16398
16399The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
16400multiplicands, a and b, and an addend c.
16401
16402Semantics:
16403""""""""""
16404
16405The expression:
16406
16407::
16408
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
16410
16411is equivalent to the expression a \* b + c, except that it is unspecified
16412whether rounding will be performed between the multiplication and addition
16413steps. Fusion is not guaranteed, even if the target platform supports it.
16414If a fused multiply-add is required, the corresponding
16415:ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets ``errno``.
16417
16418Examples:
16419"""""""""
16420
16421.. code-block:: llvm
16422
      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
16424
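When the operation is not fused, the result matches a separate multiply and
add, with a rounding step after each. The following sketch shows that unfused
form; the function name ``@muladd_unfused`` is hypothetical, and a code
generator remains free to produce the fused form instead.

.. code-block:: llvm

      define float @muladd_unfused(float %a, float %b, float %c) {
        %mul = fmul float %a, %b     ; the product is rounded to float here
        %sum = fadd float %mul, %c   ; the sum is rounded again
        ret float %sum
      }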
16425
16426Hardware-Loop Intrinsics
16427------------------------
16428
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required to lower these intrinsics further to
target-specific instructions, or to revert the hardware-loop to a normal loop
if target-specific restrictions are not met and a hardware-loop can't be
generated.
16433
16434These intrinsics may be modified in the future and are not intended to be used
16435outside the backend. Thus, front-end and mid-level optimizations should not be
16436generating these intrinsics.
16437
16438
16439'``llvm.set.loop.iterations.*``' Intrinsic
16440^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16441
16442Syntax:
16443"""""""
16444
16445This is an overloaded intrinsic.
16446
16447::
16448
16449      declare void @llvm.set.loop.iterations.i32(i32)
16450      declare void @llvm.set.loop.iterations.i64(i64)
16451
16452Overview:
16453"""""""""
16454
16455The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
16456hardware-loop trip count. They are placed in the loop preheader basic block and
16457are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
16458instructions.
16459
16460Arguments:
16461""""""""""
16462
16463The integer operand is the loop trip count of the hardware-loop, and thus
16464not e.g. the loop back-edge taken count.
16465
16466Semantics:
16467""""""""""
16468
The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16473
16474
16475'``llvm.start.loop.iterations.*``' Intrinsic
16476^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16477
16478Syntax:
16479"""""""
16480
16481This is an overloaded intrinsic.
16482
16483::
16484
16485      declare i32 @llvm.start.loop.iterations.i32(i32)
16486      declare i64 @llvm.start.loop.iterations.i64(i64)
16487
16488Overview:
16489"""""""""
16490
The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the hardware-loop
trip count, but they also produce a value identical to the input that can be
used as the input to the loop. They are placed in the loop preheader basic
block, and the output is expected to be the input to the phi for the induction
variable of the loop, which is decremented by
'``llvm.loop.decrement.reg.*``'.
16498
16499Arguments:
16500""""""""""
16501
16502The integer operand is the loop trip count of the hardware-loop, and thus
16503not e.g. the loop back-edge taken count.
16504
16505Semantics:
16506""""""""""
16507
The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16512
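As a rough illustration, the sketch below shows how the returned value might
feed the induction phi and be decremented on each iteration. The function and
block names are hypothetical, ``%n`` is assumed to be non-zero, and the
decrement of 1 is only one possible choice.

.. code-block:: llvm

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define void @start_hwloop_sketch(i32 %n) {
      entry:                                  ; acts as the loop preheader
        %start = call i32 @llvm.start.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %count = phi i32 [ %start, %entry ], [ %next, %loop ]
        ; ... loop body ...
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
        %more = icmp ne i32 %next, 0
        br i1 %more, label %loop, label %exit

      exit:
        ret void
      }
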
16513'``llvm.test.set.loop.iterations.*``' Intrinsic
16514^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16515
16516Syntax:
16517"""""""
16518
16519This is an overloaded intrinsic.
16520
16521::
16522
16523      declare i1 @llvm.test.set.loop.iterations.i32(i32)
16524      declare i1 @llvm.test.set.loop.iterations.i64(i64)
16525
16526Overview:
16527"""""""""
16528
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count and also to test that the given count is not zero, allowing
them to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
16534
16535Arguments:
16536""""""""""
16537
16538The integer operand is the loop trip count of the hardware-loop, and thus
16539not e.g. the loop back-edge taken count.
16540
16541Semantics:
16542""""""""""
16543
The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is the conditional value of whether the given count is
not zero.
16549
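The guarded, while-loop form described above might look like the following
sketch. The function and block names are hypothetical, and the loop counter is
updated with '``llvm.loop.decrement.*``' using a decrement of 1 purely for
illustration.

.. code-block:: llvm

      declare i1 @llvm.test.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define void @while_hwloop_sketch(i32 %n) {
      entry:                                  ; predecessor of the preheader
        %enter = call i1 @llvm.test.set.loop.iterations.i32(i32 %n)
        br i1 %enter, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        ; ... loop body ...
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }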
16550
16551'``llvm.test.start.loop.iterations.*``' Intrinsic
16552^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16553
16554Syntax:
16555"""""""
16556
16557This is an overloaded intrinsic.
16558
16559::
16560
16561      declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
16562      declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
16563
16564Overview:
16565"""""""""
16566
The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics: they specify the hardware-loop trip count, but they also produce a
value identical to the input that can be used as the input to the loop. The
second output, of type ``i1``, controls entry to a while-loop.
16572
16573Arguments:
16574""""""""""
16575
16576The integer operand is the loop trip count of the hardware-loop, and thus
16577not e.g. the loop back-edge taken count.
16578
16579Semantics:
16580""""""""""
16581
The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair of the input and a conditional value of
whether the given count is not zero.
16588
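Since the result is a pair, its two members are typically pulled apart with
``extractvalue``. The sketch below shows one plausible arrangement; the
function and block names are hypothetical, and the decrement of 1 is an
assumption.

.. code-block:: llvm

      declare { i32, i1 } @llvm.test.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define void @test_start_hwloop_sketch(i32 %n) {
      entry:
        %pair = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %n)
        %start = extractvalue { i32, i1 } %pair, 0   ; same value as %n
        %enter = extractvalue { i32, i1 } %pair, 1   ; true if %n is not zero
        br i1 %enter, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        %count = phi i32 [ %start, %preheader ], [ %next, %loop ]
        ; ... loop body ...
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
        %more = icmp ne i32 %next, 0
        br i1 %more, label %loop, label %exit

      exit:
        ret void
      }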
16589
16590'``llvm.loop.decrement.reg.*``' Intrinsic
16591^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16592
16593Syntax:
16594"""""""
16595
16596This is an overloaded intrinsic.
16597
16598::
16599
16600      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
16601      declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
16602
16603Overview:
16604"""""""""
16605
16606The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
16607iteration counter and return an updated value that will be used in the next
16608loop test check.
16609
16610Arguments:
16611""""""""""
16612
16613Both arguments must have identical integer types. The first operand is the
16614loop iteration counter. The second operand is the maximum number of elements
16615processed in an iteration.
16616
16617Semantics:
16618""""""""""
16619
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisations are allowed to treat it as a ``SUB``, and it is supported by
SCEV, so it's the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
16628
16629
16630'``llvm.loop.decrement.*``' Intrinsic
16631^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16632
16633Syntax:
16634"""""""
16635
16636This is an overloaded intrinsic.
16637
16638::
16639
16640      declare i1 @llvm.loop.decrement.i32(i32)
16641      declare i1 @llvm.loop.decrement.i64(i64)
16642
16643Overview:
16644"""""""""
16645
16646The HardwareLoops pass allows the loop decrement value to be specified with an
16647option. It defaults to a loop decrement value of 1, but it can be an unsigned
16648integer value provided by this option.  The '``llvm.loop.decrement.*``'
16649intrinsics decrement the loop iteration counter with this value, and return a
16650false predicate if the loop should exit, and true otherwise.
16651This is emitted if the loop counter is not updated via a ``PHI`` node, which
16652can also be controlled with an option.
16653
16654Arguments:
16655""""""""""
16656
16657The integer argument is the loop decrement value used to decrement the loop
16658iteration counter.
16659
16660Semantics:
16661""""""""""
16662
The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
16667
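In this form the loop counter is implicit: no ``PHI`` carries it, and the
intrinsic's result directly drives the back-edge branch. The sketch below pairs
it with '``llvm.set.loop.iterations.*``'; the function and block names are
hypothetical, and ``%n`` is assumed to be non-zero.

.. code-block:: llvm

      declare void @llvm.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define void @hwloop_sketch(i32 %n) {
      entry:                                  ; acts as the loop preheader
        call void @llvm.set.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        ; ... loop body ...
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }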
16668
16669Vector Reduction Intrinsics
16670---------------------------
16671
16672Horizontal reductions of vectors can be expressed using the following
16673intrinsics. Each one takes a vector operand as an input and applies its
16674respective operation across all elements of the vector, returning a single
16675scalar result of the same element type.
16676
16677.. _int_vector_reduce_add:
16678
16679'``llvm.vector.reduce.add.*``' Intrinsic
16680^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16681
16682Syntax:
16683"""""""
16684
16685::
16686
16687      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
16688      declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
16689
16690Overview:
16691"""""""""
16692
16693The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
16694reduction of a vector, returning the result as a scalar. The return type matches
16695the element-type of the vector input.
16696
16697Arguments:
16698""""""""""
16699The argument to this intrinsic must be a vector of integer values.
16700
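A minimal usage sketch follows; the function name ``@sum4`` and the constant
input are chosen for illustration only.

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)

      define i32 @sum4() {
        ; 1 + 2 + 3 + 4
        %sum = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>) ; %sum = 10
        ret i32 %sum
      }
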
16701.. _int_vector_reduce_fadd:
16702
16703'``llvm.vector.reduce.fadd.*``' Intrinsic
16704^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16705
16706Syntax:
16707"""""""
16708
16709::
16710
16711      declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
16712      declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
16713
16714Overview:
16715"""""""""
16716
16717The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
16718``ADD`` reduction of a vector, returning the result as a scalar. The return type
16719matches the element-type of the vector input.
16720
16721If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16722preserve the associativity of an equivalent scalarized counterpart. Otherwise
16723the reduction will be *sequential*, thus implying that the operation respects
16724the associativity of a scalarized reduction. That is, the reduction begins with
16725the start value and performs an fadd operation with consecutively increasing
16726vector element indices. See the following pseudocode:
16727
16728::
16729
    float sequential_fadd(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result + input_vector[i]
      return result
16735
16736
16737Arguments:
16738""""""""""
16739The first argument to this intrinsic is a scalar start value for the reduction.
16740The type of the start value matches the element-type of the vector input.
16741The second argument must be a vector of floating-point values.
16742
16743To ignore the start value, negative zero (``-0.0``) can be used, as it is
16744the neutral value of floating point addition.
16745
16746Examples:
16747"""""""""
16748
16749::
16750
      %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
16753
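For a four-element vector, the sequential form corresponds to the following
chain of scalar operations. This is a sketch of the evaluation order only; the
function name ``@fadd_sequential_sketch`` is hypothetical.

.. code-block:: llvm

      define float @fadd_sequential_sketch(float %start_value, <4 x float> %input) {
        %e0 = extractelement <4 x float> %input, i32 0
        %e1 = extractelement <4 x float> %input, i32 1
        %e2 = extractelement <4 x float> %input, i32 2
        %e3 = extractelement <4 x float> %input, i32 3
        %r0 = fadd float %start_value, %e0   ; start value first
        %r1 = fadd float %r0, %e1            ; then lanes in increasing index order
        %r2 = fadd float %r1, %e2
        %r3 = fadd float %r2, %e3
        ret float %r3
      }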
16754
16755.. _int_vector_reduce_mul:
16756
16757'``llvm.vector.reduce.mul.*``' Intrinsic
16758^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16759
16760Syntax:
16761"""""""
16762
16763::
16764
16765      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
16766      declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
16767
16768Overview:
16769"""""""""
16770
16771The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
16772reduction of a vector, returning the result as a scalar. The return type matches
16773the element-type of the vector input.
16774
16775Arguments:
16776""""""""""
16777The argument to this intrinsic must be a vector of integer values.
16778
16779.. _int_vector_reduce_fmul:
16780
16781'``llvm.vector.reduce.fmul.*``' Intrinsic
16782^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16783
16784Syntax:
16785"""""""
16786
16787::
16788
16789      declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
16790      declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
16791
16792Overview:
16793"""""""""
16794
16795The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
16796``MUL`` reduction of a vector, returning the result as a scalar. The return type
16797matches the element-type of the vector input.
16798
16799If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16800preserve the associativity of an equivalent scalarized counterpart. Otherwise
16801the reduction will be *sequential*, thus implying that the operation respects
16802the associativity of a scalarized reduction. That is, the reduction begins with
16803the start value and performs an fmul operation with consecutively increasing
16804vector element indices. See the following pseudocode:
16805
16806::
16807
    float sequential_fmul(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result * input_vector[i]
      return result
16813
16814
16815Arguments:
16816""""""""""
16817The first argument to this intrinsic is a scalar start value for the reduction.
16818The type of the start value matches the element-type of the vector input.
16819The second argument must be a vector of floating-point values.
16820
16821To ignore the start value, one (``1.0``) can be used, as it is the neutral
16822value of floating point multiplication.
16823
16824Examples:
16825"""""""""
16826
16827::
16828
      %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
16831
16832.. _int_vector_reduce_and:
16833
16834'``llvm.vector.reduce.and.*``' Intrinsic
16835^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16836
16837Syntax:
16838"""""""
16839
16840::
16841
16842      declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
16843
16844Overview:
16845"""""""""
16846
16847The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
16848reduction of a vector, returning the result as a scalar. The return type matches
16849the element-type of the vector input.
16850
16851Arguments:
16852""""""""""
16853The argument to this intrinsic must be a vector of integer values.
16854
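One common use is to test whether a predicate holds in every lane by reducing a
vector of ``i1`` values. The sketch below assumes the ``v4i1`` overload and uses
a hypothetical function name.

.. code-block:: llvm

      declare i1 @llvm.vector.reduce.and.v4i1(<4 x i1>)

      define i1 @all_positive(<4 x i32> %v) {
        %cmp = icmp sgt <4 x i32> %v, zeroinitializer   ; per-lane test
        %all = call i1 @llvm.vector.reduce.and.v4i1(<4 x i1> %cmp)
        ret i1 %all                                     ; true only if every lane passed
      }
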
16855.. _int_vector_reduce_or:
16856
16857'``llvm.vector.reduce.or.*``' Intrinsic
16858^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16859
16860Syntax:
16861"""""""
16862
16863::
16864
16865      declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
16866
16867Overview:
16868"""""""""
16869
16870The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
16871of a vector, returning the result as a scalar. The return type matches the
16872element-type of the vector input.
16873
16874Arguments:
16875""""""""""
16876The argument to this intrinsic must be a vector of integer values.
16877
16878.. _int_vector_reduce_xor:
16879
16880'``llvm.vector.reduce.xor.*``' Intrinsic
16881^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16882
16883Syntax:
16884"""""""
16885
16886::
16887
16888      declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
16889
16890Overview:
16891"""""""""
16892
16893The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
16894reduction of a vector, returning the result as a scalar. The return type matches
16895the element-type of the vector input.
16896
16897Arguments:
16898""""""""""
16899The argument to this intrinsic must be a vector of integer values.
16900
16901.. _int_vector_reduce_smax:
16902
16903'``llvm.vector.reduce.smax.*``' Intrinsic
16904^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16905
16906Syntax:
16907"""""""
16908
16909::
16910
16911      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
16912
16913Overview:
16914"""""""""
16915
16916The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
16917``MAX`` reduction of a vector, returning the result as a scalar. The return type
16918matches the element-type of the vector input.
16919
16920Arguments:
16921""""""""""
16922The argument to this intrinsic must be a vector of integer values.
16923
16924.. _int_vector_reduce_smin:
16925
16926'``llvm.vector.reduce.smin.*``' Intrinsic
16927^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16928
16929Syntax:
16930"""""""
16931
16932::
16933
16934      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
16935
16936Overview:
16937"""""""""
16938
16939The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
16940``MIN`` reduction of a vector, returning the result as a scalar. The return type
16941matches the element-type of the vector input.
16942
16943Arguments:
16944""""""""""
16945The argument to this intrinsic must be a vector of integer values.
16946
16947.. _int_vector_reduce_umax:
16948
16949'``llvm.vector.reduce.umax.*``' Intrinsic
16950^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16951
16952Syntax:
16953"""""""
16954
16955::
16956
16957      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
16958
16959Overview:
16960"""""""""
16961
16962The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
16963integer ``MAX`` reduction of a vector, returning the result as a scalar. The
16964return type matches the element-type of the vector input.
16965
16966Arguments:
16967""""""""""
16968The argument to this intrinsic must be a vector of integer values.
16969
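The signed and unsigned reductions differ only in how the lane values are
compared. In the sketch below, the same input produces different results
because ``-1`` is the largest value when interpreted as unsigned. The function
names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32>)
      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32>)

      define i32 @smax_example() {
        %s = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> <i32 -1, i32 2, i32 0, i32 1>) ; %s = 2
        ret i32 %s
      }

      define i32 @umax_example() {
        %u = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> <i32 -1, i32 2, i32 0, i32 1>) ; %u = -1 (4294967295 as unsigned)
        ret i32 %u
      }
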
16970.. _int_vector_reduce_umin:
16971
16972'``llvm.vector.reduce.umin.*``' Intrinsic
16973^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16974
16975Syntax:
16976"""""""
16977
16978::
16979
16980      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
16981
16982Overview:
16983"""""""""
16984
16985The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
16986integer ``MIN`` reduction of a vector, returning the result as a scalar. The
16987return type matches the element-type of the vector input.
16988
16989Arguments:
16990""""""""""
16991The argument to this intrinsic must be a vector of integer values.
16992
16993.. _int_vector_reduce_fmax:
16994
16995'``llvm.vector.reduce.fmax.*``' Intrinsic
16996^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16997
16998Syntax:
16999"""""""
17000
17001::
17002
17003      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
17004      declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
17005
17006Overview:
17007"""""""""
17008
17009The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
17010``MAX`` reduction of a vector, returning the result as a scalar. The return type
17011matches the element-type of the vector input.
17012
17013This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
17014intrinsic. That is, the result will always be a number unless all elements of
17015the vector are NaN. For a vector with maximum element magnitude 0.0 and
17016containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17017
17018If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17019assume that NaNs are not present in the input vector.
17020
17021Arguments:
17022""""""""""
17023The argument to this intrinsic must be a vector of floating-point values.
17024
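The NaN behavior described above can be seen in a small sketch: a quiet NaN in
one lane does not affect the result as long as at least one lane holds a
number. The function name ``@fmax_ignores_nan`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float>)

      define float @fmax_ignores_nan() {
        ; Lane 0 is a quiet NaN; the remaining lanes are 1.0, 3.0 and 2.0.
        %m = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> <float 0x7FF8000000000000, float 1.0, float 3.0, float 2.0>) ; %m = 3.0
        ret float %m
      }
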
17025.. _int_vector_reduce_fmin:
17026
17027'``llvm.vector.reduce.fmin.*``' Intrinsic
17028^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17029
17030Syntax:
17031"""""""
17032This is an overloaded intrinsic.
17033
17034::
17035
17036      declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
17037      declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
17038
17039Overview:
17040"""""""""
17041
17042The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
17043``MIN`` reduction of a vector, returning the result as a scalar. The return type
17044matches the element-type of the vector input.
17045
17046This instruction has the same comparison semantics as the '``llvm.minnum.*``'
17047intrinsic. That is, the result will always be a number unless all elements of
17048the vector are NaN. For a vector with minimum element magnitude 0.0 and
17049containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17050
17051If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17052assume that NaNs are not present in the input vector.
17053
17054Arguments:
17055""""""""""
17056The argument to this intrinsic must be a vector of floating-point values.
17057
17058'``llvm.experimental.vector.insert``' Intrinsic
17059^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17060
17061Syntax:
17062"""""""
17063This is an overloaded intrinsic. You can use ``llvm.experimental.vector.insert``
17064to insert a fixed-width vector into a scalable vector, but not the other way
17065around.
17066
17067::
17068
17069      declare <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 %idx)
17070      declare <vscale x 2 x double> @llvm.experimental.vector.insert.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 %idx)
17071
17072Overview:
17073"""""""""
17074
17075The '``llvm.experimental.vector.insert.*``' intrinsics insert a vector into another vector
17076starting from a given index. The return type matches the type of the vector we
17077insert into. Conceptually, this can be used to build a scalable vector out of
17078non-scalable vectors.
17079
17080Arguments:
17081""""""""""
17082
17083The ``vec`` is the vector which ``subvec`` will be inserted into.
17084The ``subvec`` is the vector that will be inserted.
17085
17086``idx`` represents the starting element number at which ``subvec`` will be
17087inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
17088vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
17089the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
17090``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
17091num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
17092cannot be determined statically but is false at runtime, then the result vector
17093is undefined.
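
As a rough illustration of the index rules above, the following Python sketch
models the intrinsic on plain lists for one concrete runtime vector length.
The helper name ``vector_insert`` is hypothetical, and raising an exception
stands in for the cases that the text describes as invalid or undefined.

.. code-block:: python

      def vector_insert(vec, subvec, idx):
          """Overwrite len(subvec) elements of vec, starting at element idx."""
          if idx % len(subvec) != 0:
              raise ValueError("idx must be a multiple of the subvector length")
          if idx < 0 or idx + len(subvec) > len(vec):
              # Out-of-range lanes: the intrinsic's result would be undefined.
              raise ValueError("subvec does not fit into vec at idx")
          return vec[:idx] + subvec + vec[idx + len(subvec):]

      # A <vscale x 4 x float> with vscale = 2 has 8 lanes at runtime.
      vec = [0.0] * 8
      print(vector_insert(vec, [1.0, 2.0, 3.0, 4.0], 4))
      # [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0]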
17094
17095
17096'``llvm.experimental.vector.extract``' Intrinsic
17097^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17098
17099Syntax:
17100"""""""
17101This is an overloaded intrinsic. You can use
17102``llvm.experimental.vector.extract`` to extract a fixed-width vector from a
17103scalable vector, but not the other way around.
17104
17105::
17106
17107      declare <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 %idx)
17108      declare <2 x double> @llvm.experimental.vector.extract.v2f64(<vscale x 2 x double> %vec, i64 %idx)
17109
17110Overview:
17111"""""""""
17112
17113The '``llvm.experimental.vector.extract.*``' intrinsics extract a vector from
17114within another vector starting from a given index. The return type must be
17115explicitly specified. Conceptually, this can be used to decompose a scalable
17116vector into non-scalable parts.
17117
17118Arguments:
17119""""""""""
17120
17121The ``vec`` is the vector from which we will extract a subvector.
17122
17123The ``idx`` specifies the starting element number within ``vec`` from which a
17124subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
17125vector length of the result type. If the result type is a scalable vector,
17126``idx`` is first scaled by the result type's runtime scaling factor. Elements
17127``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
17128indices. If this condition cannot be determined statically but is false at
17129runtime, then the result vector is undefined. The ``idx`` parameter must be a
17130vector index constant type (for most targets this will be an integer pointer
17131type).
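
The decomposition of a scalable vector into fixed-width parts can be sketched
in Python for one concrete value of ``vscale``. The helper name
``vector_extract`` is hypothetical, lists stand in for vectors, and an
exception stands in for the invalid or undefined cases described above.

.. code-block:: python

      def vector_extract(vec, result_len, idx):
          """Return result_len consecutive elements of vec, starting at idx."""
          if idx % result_len != 0:
              raise ValueError("idx must be a multiple of the result length")
          if idx < 0 or idx + result_len > len(vec):
              # Out-of-range lanes: the intrinsic's result would be undefined.
              raise ValueError("extracted range exceeds vec")
          return vec[idx:idx + result_len]

      # Split a <vscale x 4 x i32> with vscale = 2 (8 lanes) into <4 x i32> parts.
      vec = list(range(8))
      parts = [vector_extract(vec, 4, idx) for idx in range(0, len(vec), 4)]
      print(parts)  # [[0, 1, 2, 3], [4, 5, 6, 7]]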
17132
17133'``llvm.experimental.vector.reverse``' Intrinsic
17134^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17135
17136Syntax:
17137"""""""
17138This is an overloaded intrinsic.
17139
17140::
17141
17142      declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
17143      declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
17144
17145Overview:
17146"""""""""
17147
17148The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
17149The intrinsic takes a single vector and returns a vector of matching type but
17150with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
17152recommended way to express reverse operations for fixed-width vectors is still
17153to use a shufflevector, as that may allow for more optimization opportunities.
17154
17155Arguments:
17156""""""""""
17157
17158The argument to this intrinsic must be a vector.
17159
17160'``llvm.experimental.vector.splice``' Intrinsic
17161^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17162
17163Syntax:
17164"""""""
17165This is an overloaded intrinsic.
17166
17167::
17168
17169      declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
17170      declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
17171
17172Overview:
17173"""""""""
17174
17175The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
17176concatenating elements from the first input vector with elements of the second
17177input vector, returning a vector of the same type as the input vectors. The
17178signed immediate, modulo the number of elements in the vector, is the index
17179into the first vector from which to extract the result value. This means
17180conceptually that for a positive immediate, a vector is extracted from
17181``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
17182immediate, it extracts ``-imm`` trailing elements from the first vector, and
17183the remaining elements from ``%vec2``.
17184
17185These intrinsics work for both fixed and scalable vectors. While this intrinsic
17186is marked as experimental, the recommended way to express this operation for
17187fixed-width vectors is still to use a shufflevector, as that may allow for more
17188optimization opportunities.
17189
17190For example:
17191
17192.. code-block:: text
17193
17194 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1)  ==> <B, C, D, E> ; index
17195 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements
17196
17197
17198Arguments:
17199""""""""""
17200
17201The first two operands are vectors with the same type. The third argument
17202``imm`` is the start index, modulo VL, where VL is the runtime vector length of
17203the source/result vector. The ``imm`` is a signed integer constant in the range
17204``-VL <= imm < VL``. For values outside of this range the result is poison.
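
The treatment of positive and negative immediates can be checked with a small
Python model. This is only a sketch on fixed-length lists: ``vector_splice`` is
a hypothetical helper name, and an exception stands in for the poison result
produced by an out-of-range ``imm``.

.. code-block:: python

      def vector_splice(vec1, vec2, imm):
          """Model of vector.splice on two equal-length lists."""
          vl = len(vec1)
          if len(vec2) != vl:
              raise ValueError("both input vectors must have the same type")
          if not -vl <= imm < vl:
              # The intrinsic yields poison here; the model rejects the input.
              raise ValueError("imm out of range")
          # A negative imm selects -imm trailing elements of vec1, which is
          # the same window as starting at index vl + imm of the concatenation.
          start = imm if imm >= 0 else vl + imm
          return (vec1 + vec2)[start:start + vl]

      a, b = list("ABCD"), list("EFGH")
      print(vector_splice(a, b, 1))   # ['B', 'C', 'D', 'E']
      print(vector_splice(a, b, -3))  # ['B', 'C', 'D', 'E']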
17205
17206'``llvm.experimental.stepvector``' Intrinsic
17207^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17208
17209This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
17210to generate a vector whose lane values comprise the linear sequence
17211<0, 1, 2, ...>. It is primarily intended for scalable vectors.
17212
17213::
17214
17215      declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
17216      declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
17217
17218The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
17219of integers whose elements contain a linear sequence of values starting from 0
17220with a step of 1.  This experimental intrinsic can only be used for vectors
17221with integer elements that are at least 8 bits in size. If the sequence value
17222exceeds the allowed limit for the element type then the result for that lane is
17223undefined.
17224
17225These intrinsics work for both fixed and scalable vectors. While this intrinsic
17226is marked as experimental, the recommended way to express this operation for
17227fixed-width vectors is still to generate a constant vector instead.
17228
17229
17230Arguments:
17231""""""""""
17232
17233None.
17234
17235
17236Matrix Intrinsics
17237-----------------
17238
Operations on matrices requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrices are passed and returned as vectors. This means that for an ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently column-major layout
is assumed.  The intrinsics support both integer and floating-point matrices.
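
The index formula can be made concrete with a short Python sketch. The helper
names ``matrix_index`` and ``to_rows`` are hypothetical and exist only for this
illustration; a list stands in for the flattened vector.

.. code-block:: python

      def matrix_index(i, j, rows):
          """Index in the flat vector of row i, column j (column-major)."""
          return j * rows + i

      def to_rows(vec, rows, cols):
          """Rebuild the list of matrix rows from the flat column-major vector."""
          return [[vec[matrix_index(i, j, rows)] for j in range(cols)]
                  for i in range(rows)]

      # A 2 x 3 matrix stored as a vector of 6 elements.
      vec = [1, 2, 3, 4, 5, 6]
      print(matrix_index(1, 2, rows=2))  # 5
      print(to_rows(vec, 2, 3))          # [[1, 3, 5], [2, 4, 6]]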
17246
17247
17248'``llvm.matrix.transpose.*``' Intrinsic
17249^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17250
17251Syntax:
17252"""""""
17253This is an overloaded intrinsic.
17254
17255::
17256
17257      declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
17258
17259Overview:
17260"""""""""
17261
17262The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
17263<Cols>`` matrix and return the transposed matrix in the result vector.
17264
17265Arguments:
17266""""""""""
17267
17268The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17269<Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
17270number of rows and columns, respectively, and must be positive, constant
17271integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
17272the same float or integer element type as ``%In``.
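
The effect on the flattened vector can be sketched in Python as follows. The
function ``matrix_transpose`` is a hypothetical reference model, not how any
target lowers the intrinsic; it assumes the column-major layout described at
the start of this section.

.. code-block:: python

      def matrix_transpose(vec, rows, cols):
          """Transpose a rows x cols matrix held in a flat column-major list."""
          if rows <= 0 or cols <= 0 or len(vec) != rows * cols:
              raise ValueError("vector length must equal rows * cols")
          out = [None] * (rows * cols)
          for j in range(cols):
              for i in range(rows):
                  # Element (i, j) of the input becomes element (j, i) of the
                  # cols x rows result, whose columns hold `cols` elements.
                  out[i * cols + j] = vec[j * rows + i]
          return out

      print(matrix_transpose([1, 2, 3, 4, 5, 6], 2, 3))  # [1, 3, 5, 2, 4, 6]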
17273
17274'``llvm.matrix.multiply.*``' Intrinsic
17275^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17276
17277Syntax:
17278"""""""
17279This is an overloaded intrinsic.
17280
17281::
17282
17283      declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
17284
17285Overview:
17286"""""""""
17287
The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
17291
17292Arguments:
17293""""""""""
17294
17295The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
17296<Inner>`` elements, and the second argument ``%B`` to a matrix with
17297``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
17298``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
17299returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
17300Vectors ``%A``, ``%B``, and the returned vector all have the same float or
17301integer element type.
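
A reference computation on flattened vectors might look like the following
Python sketch. The name ``matrix_multiply`` is hypothetical, the column-major
layout from the start of this section is assumed, and no claim is made about
evaluation order or rounding in an actual lowering.

.. code-block:: python

      def matrix_multiply(a, b, outer_rows, inner, outer_cols):
          """Multiply two matrices held in flat column-major lists."""
          if len(a) != outer_rows * inner or len(b) != inner * outer_cols:
              raise ValueError("operand sizes do not match the dimensions")
          out = []
          for c in range(outer_cols):          # result column
              for r in range(outer_rows):      # result row
                  # Dot product of row r of A with column c of B.
                  out.append(sum(a[k * outer_rows + r] * b[c * inner + k]
                                 for k in range(inner)))
          return out

      # A = [[1, 3], [2, 4]] and B = [[5, 7], [6, 8]], both stored by column.
      print(matrix_multiply([1, 2, 3, 4], [5, 6, 7, 8], 2, 2, 2))
      # [23, 34, 31, 46]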
17302
17303
17304'``llvm.matrix.column.major.load.*``' Intrinsic
17305^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17306
17307Syntax:
17308"""""""
17309This is an overloaded intrinsic.
17310
17311::
17312
17313      declare vectorty @llvm.matrix.column.major.load.*(
17314          ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17315
17316Overview:
17317"""""""""
17318
17319The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
17320matrix using a stride of ``%Stride`` to compute the start address of the
17321different columns.  The offset is computed using ``%Stride``'s bitwidth. This
allows for convenient loading of submatrices. If ``<IsVolatile>`` is true, the
17323intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
17324matrix is returned in the result vector. If the ``%Ptr`` argument is known to
17325be aligned to some boundary, this can be specified as an attribute on the
17326argument.
17327
17328Arguments:
17329""""""""""
17330
The first argument ``%Ptr`` is a pointer type to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value.  The fourth and fifth arguments,
``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
respectively, and must be positive, constant integers. The returned vector must
have ``<Rows> * <Cols>`` elements.

The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
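
The role of ``%Stride`` when loading a submatrix can be sketched in Python.
This is a hypothetical model: ``column_major_load`` is not an LLVM API, a flat
list stands in for memory, and addresses and strides are assumed to be counted
in elements rather than bytes.

.. code-block:: python

      def column_major_load(memory, ptr, stride, rows, cols):
          """Gather a rows x cols matrix from a list standing in for memory."""
          if stride < rows:
              raise ValueError("stride must be at least the number of rows")
          result = []
          for c in range(cols):
              start = ptr + c * stride      # start address of column c
              result.extend(memory[start:start + rows])
          return result

      # Memory holds a 4 x 3 column-major matrix; load its top 2 x 3 submatrix.
      memory = list(range(12))
      print(column_major_load(memory, ptr=0, stride=4, rows=2, cols=3))
      # [0, 1, 4, 5, 8, 9]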
17343
17344
17345'``llvm.matrix.column.major.store.*``' Intrinsic
17346^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17347
17348Syntax:
17349"""""""
17350
17351::
17352
17353      declare void @llvm.matrix.column.major.store.*(
17354          vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17355
17356Overview:
17357"""""""""
17358
17359The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
17360<Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
17361columns. The offset is computed using ``%Stride``'s bitwidth. If
17362``<IsVolatile>`` is true, the intrinsic is considered a
17363:ref:`volatile memory access <volatile>`.
17364
17365If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
17366specified as an attribute on the argument.
17367
17368Arguments:
17369""""""""""
17370
17371The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17372<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
17373pointer to the vector type of ``%In``, and is the start address of the matrix
17374in memory. The third argument ``%Stride`` is a positive, constant integer with
17375``%Stride >= <Rows>``.  ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
17377with ``%Ptr + C * %Stride``.  The fourth argument ``<IsVolatile>`` is a boolean
17378value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
17379and columns, respectively, and must be positive, constant integers.
17380
17381The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
17383
17384
17385Half Precision Floating-Point Intrinsics
17386----------------------------------------
17387
17388For most target platforms, half precision floating-point is a
17389storage-only format. This means that it is a dense encoding (in memory)
17390but does not support computation in the format.
17391
Code must therefore first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, then stored as an
i16 value.
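
The load, widen, compute, narrow, store pattern can be imitated in Python with
the standard ``struct`` module, whose ``e`` format is IEEE 754 binary16. The
helper names below are hypothetical stand-ins for the two intrinsics. Note the
assumptions: Python floats are double precision rather than ``float``, and
``struct.pack`` raises ``OverflowError`` for values too large for half
precision, which need not match what the intrinsic does.

.. code-block:: python

      import struct

      def convert_to_fp16(value):
          """Round a Python float to half precision; return the 16-bit pattern."""
          # 'e' packs IEEE 754 binary16; 'H' reads the same two bytes back
          # as an unsigned 16-bit integer.
          return struct.unpack("<H", struct.pack("<e", value))[0]

      def convert_from_fp16(bits):
          """Widen a 16-bit half-precision pattern to a Python float."""
          return struct.unpack("<e", struct.pack("<H", bits))[0]

      stored = convert_to_fp16(1.5)         # kept in memory as an i16
      widened = convert_from_fp16(stored)   # compute on the widened value
      print(hex(convert_to_fp16(widened * 2.0)))  # 0x4200, the encoding of 3.0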
17400
17401.. _int_convert_to_fp16:
17402
17403'``llvm.convert.to.fp16``' Intrinsic
17404^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17405
17406Syntax:
17407"""""""
17408
17409::
17410
17411      declare i16 @llvm.convert.to.fp16.f32(float %a)
17412      declare i16 @llvm.convert.to.fp16.f64(double %a)
17413
17414Overview:
17415"""""""""
17416
17417The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17418conventional floating-point type to half precision floating-point format.
17419
17420Arguments:
17421""""""""""
17422
The intrinsic function takes a single argument: the value to be
converted.
17425
17426Semantics:
17427""""""""""
17428
17429The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17430conventional floating-point format to half precision floating-point format. The
17431return value is an ``i16`` which contains the converted number.
17432
17433Examples:
17434"""""""""
17435
17436.. code-block:: llvm
17437
17438      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
17439      store i16 %res, i16* @x, align 2
17440
17441.. _int_convert_from_fp16:
17442
17443'``llvm.convert.from.fp16``' Intrinsic
17444^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17445
17446Syntax:
17447"""""""
17448
17449::
17450
17451      declare float @llvm.convert.from.fp16.f32(i16 %a)
17452      declare double @llvm.convert.from.fp16.f64(i16 %a)
17453
17454Overview:
17455"""""""""
17456
17457The '``llvm.convert.from.fp16``' intrinsic function performs a
17458conversion from half precision floating-point format to single precision
17459floating-point format.
17460
17461Arguments:
17462""""""""""
17463
The intrinsic function takes a single argument: the value to be
converted.
17466
17467Semantics:
17468""""""""""
17469
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.
17474
17475Examples:
17476"""""""""
17477
17478.. code-block:: llvm
17479
17480      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
17482
17483Saturating floating-point to integer conversions
17484------------------------------------------------
17485
17486The ``fptoui`` and ``fptosi`` instructions return a
17487:ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
17488representable by the result type. These intrinsics provide an alternative
17489conversion, which will saturate towards the smallest and largest representable
17490integer values instead.
17491
17492'``llvm.fptoui.sat.*``' Intrinsic
17493^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17494
17495Syntax:
17496"""""""
17497
17498This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
17499floating-point argument type and any integer result type, or vectors thereof.
17500Not all targets may support all types, however.
17501
17502::
17503
17504      declare i32 @llvm.fptoui.sat.i32.f32(float %f)
17505      declare i19 @llvm.fptoui.sat.i19.f64(double %f)
17506      declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
17507
17508Overview:
17509"""""""""
17510
17511This intrinsic converts the argument into an unsigned integer using saturating
17512semantics.
17513
17514Arguments:
17515""""""""""
17516
17517The argument may be any floating-point or vector of floating-point type. The
17518return value may be any integer or vector of integer type. The number of vector
17519elements in argument and return must be the same.
17520
17521Semantics:
17522""""""""""
17523
17524The conversion to integer is performed subject to the following rules:
17525
17526- If the argument is any NaN, zero is returned.
17527- If the argument is smaller than zero (this includes negative infinity),
17528  zero is returned.
17529- If the argument is larger than the largest representable unsigned integer of
17530  the result type (this includes positive infinity), the largest representable
17531  unsigned integer is returned.
17532- Otherwise, the result of rounding the argument towards zero is returned.
17533
17534Example:
17535""""""""
17536
17537.. code-block:: text
17538
17539      %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9)              ; yields i8: 123
17540      %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7)               ; yields i8:   0
17541      %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
17542      %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:   0
17543
17544'``llvm.fptosi.sat.*``' Intrinsic
17545^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17546
17547Syntax:
17548"""""""
17549
17550This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
17551floating-point argument type and any integer result type, or vectors thereof.
17552Not all targets may support all types, however.
17553
17554::
17555
17556      declare i32 @llvm.fptosi.sat.i32.f32(float %f)
17557      declare i19 @llvm.fptosi.sat.i19.f64(double %f)
17558      declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
17559
17560Overview:
17561"""""""""
17562
17563This intrinsic converts the argument into a signed integer using saturating
17564semantics.
17565
17566Arguments:
17567""""""""""
17568
17569The argument may be any floating-point or vector of floating-point type. The
17570return value may be any integer or vector of integer type. The number of vector
17571elements in argument and return must be the same.
17572
17573Semantics:
17574""""""""""
17575
17576The conversion to integer is performed subject to the following rules:
17577
17578- If the argument is any NaN, zero is returned.
17579- If the argument is smaller than the smallest representable signed integer of
17580  the result type (this includes negative infinity), the smallest
17581  representable signed integer is returned.
17582- If the argument is larger than the largest representable signed integer of
17583  the result type (this includes positive infinity), the largest representable
17584  signed integer is returned.
17585- Otherwise, the result of rounding the argument towards zero is returned.
17586
17587Example:
17588""""""""
17589
17590.. code-block:: text
17591
17592      %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9)               ; yields i8:   23
17593      %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8)             ; yields i8: -128
17594      %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8:  127
17595      %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:    0
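
The rules for both saturating conversions can be summarised in one small
Python model. This is a sketch only: ``fp_to_int_sat`` and its ``signed`` flag
are hypothetical names introduced for illustration, and Python floats (double
precision) stand in for the floating-point argument.

.. code-block:: python

      import math

      def fp_to_int_sat(value, bits, signed):
          """Saturating float-to-integer conversion for a `bits`-wide result."""
          if math.isnan(value):
              return 0
          lo = -(1 << (bits - 1)) if signed else 0
          hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
          if value <= lo:       # includes negative infinity
              return lo
          if value >= hi:       # includes positive infinity
              return hi
          return math.trunc(value)  # round towards zero

      # The i8 examples given for llvm.fptoui.sat and llvm.fptosi.sat.
      unsigned_inputs = (123.9, -5.7, 377.0, math.nan)
      signed_inputs = (23.9, -130.8, 999.0, math.nan)
      print([fp_to_int_sat(v, 8, signed=False) for v in unsigned_inputs])
      # [123, 0, 255, 0]
      print([fp_to_int_sat(v, 8, signed=True) for v in signed_inputs])
      # [23, -128, 127, 0]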
17596
17597.. _dbg_intrinsics:
17598
17599Debugger Intrinsics
17600-------------------
17601
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
17606
17607Exception Handling Intrinsics
17608-----------------------------
17609
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
17613
17614.. _int_trampoline:
17615
17616Trampoline Intrinsics
17617---------------------
17618
17619These intrinsics make it possible to excise one parameter, marked with
17620the :ref:`nest <nest>` attribute, from a function. The result is a
17621callable function pointer lacking the nest parameter - the caller does
17622not need to provide a value for it. Instead, the value to use is stored
17623in advance in a "trampoline", a block of memory usually allocated on the
17624stack, which also contains code to splice the nest value into the
17625argument list. This is used to implement the GCC nested function address
17626extension.
17627
17628For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
17629then the resulting function pointer has signature ``i32 (i32, i32)*``.
17630It can be created as follows:
17631
17632.. code-block:: llvm
17633
17634      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
17635      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
17637      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
17638      %fp = bitcast i8* %p to i32 (i32, i32)*
17639
The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.
17642
17643.. _int_it:
17644
17645'``llvm.init.trampoline``' Intrinsic
17646^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17647
17648Syntax:
17649"""""""
17650
17651::
17652
17653      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
17654
17655Overview:
17656"""""""""
17657
17658This fills the memory pointed to by ``tramp`` with executable code,
17659turning it into a trampoline.
17660
17661Arguments:
17662""""""""""
17663
17664The ``llvm.init.trampoline`` intrinsic takes three arguments, all
17665pointers. The ``tramp`` argument must point to a sufficiently large and
17666sufficiently aligned block of memory; this memory is written to by the
17667intrinsic. Note that the size and the alignment are target-specific -
17668LLVM currently provides no portable way of determining them, so a
17669front-end that generates this intrinsic needs to have some
17670target-specific knowledge. The ``func`` argument must hold a function
17671bitcast to an ``i8*``.
17672
17673Semantics:
17674""""""""""
17675
17676The block of memory pointed to by ``tramp`` is filled with target
17677dependent code, turning it into a function. Then ``tramp`` needs to be
17678passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
17679be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
17680function's signature is the same as that of ``func`` with any arguments
17681marked with the ``nest`` attribute removed. At most one such ``nest``
17682argument is allowed, and it must be of pointer type. Calling the new
17683function is equivalent to calling ``func`` with the same argument list,
17684but with ``nval`` used for the missing ``nest`` argument. If, after
17685calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
17686modified, then the effect of any later call to the returned function
17687pointer is undefined.
17688
17689.. _int_at:
17690
17691'``llvm.adjust.trampoline``' Intrinsic
17692^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17693
17694Syntax:
17695"""""""
17696
17697::
17698
17699      declare i8* @llvm.adjust.trampoline(i8* <tramp>)
17700
17701Overview:
17702"""""""""
17703
17704This performs any required machine-specific adjustment to the address of
17705a trampoline (passed as ``tramp``).
17706
17707Arguments:
17708""""""""""
17709
17710``tramp`` must point to a block of memory which already has trampoline
17711code filled in by a previous call to
17712:ref:`llvm.init.trampoline <int_it>`.
17713
17714Semantics:
17715""""""""""
17716
17717On some architectures the address of the code to be executed needs to be
17718different than the address where the trampoline is actually stored. This
17719intrinsic returns the executable address corresponding to ``tramp``
17720after performing the required machine specific adjustments. The pointer
17721returned can then be :ref:`bitcast and executed <int_trampoline>`.
17722
17723
17724.. _int_vp:
17725
17726Vector Predication Intrinsics
17727-----------------------------
17728VP intrinsics are intended for predicated SIMD/vector code.  A typical VP
17729operation takes a vector mask and an explicit vector length parameter as in:
17730
17731::
17732
17733      <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
17734
The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for
example ``<32 x i1>``.  The explicit vector length parameter always has the
type ``i32`` and is an unsigned integer value.  The explicit vector length
parameter (``%evl``) is in the range:
17739
17740::
17741
17742      0 <= %evl <= W,  where W is the number of vector elements
17743
17744Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
17745length of the vector.
17746
17747The VP intrinsic has undefined behavior if ``%evl > W``.  The explicit vector
17748length (%evl) creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set
17749to True, and all other lanes ``%evl <= i < W`` to False.  A new mask %M is
17750calculated with an element-wise AND from %mask and %EVLmask:
17751
17752::
17753
17754      M = %mask AND %EVLmask
17755
17756A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
17757
17758::
17759
       A <opcode> B =  {  A[i] <opcode> B[i]   if M[i] = True, and
                       {  undef                otherwise
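
The combination of ``%mask`` and ``%evl`` can be modelled with a few lines of
Python. The sketch below is illustrative: ``vp_binary_op`` and ``UNDEF`` are
hypothetical names, lists stand in for vectors, and ``None`` marks lanes whose
value is undefined.

.. code-block:: python

      import operator

      UNDEF = None  # stand-in for an undefined result lane

      def vp_binary_op(op, a, b, mask, evl):
          """Apply op lane-wise where both %mask and the %evl mask are set."""
          w = len(a)
          if not 0 <= evl <= w:
              raise ValueError("%evl > W is undefined behavior")
          evl_mask = [i < evl for i in range(w)]
          m = [bit and in_range for bit, in_range in zip(mask, evl_mask)]
          return [op(x, y) if enabled else UNDEF
                  for x, y, enabled in zip(a, b, m)]

      print(vp_binary_op(operator.add, [1, 2, 3, 4], [10, 20, 30, 40],
                         [True, False, True, True], evl=3))
      # [11, None, 33, None]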
17762
17763Optimization Hint
17764^^^^^^^^^^^^^^^^^
17765
17766Some targets, such as AVX512, do not support the %evl parameter in hardware.
17767The use of an effective %evl is discouraged for those targets.  The function
17768``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
17769has native support for %evl.
17770
17771.. _int_vp_select:
17772
17773'``llvm.vp.select.*``' Intrinsics
17774^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17775
17776Syntax:
17777"""""""
17778This is an overloaded intrinsic.
17779
17780::
17781
17782      declare <16 x i32>  @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
      declare <vscale x 4 x i64>  @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
17784
17785Overview:
17786"""""""""
17787
17788The '``llvm.vp.select``' intrinsic is used to choose one value based on a
17789condition vector, without IR-level branching.
17790
17791Arguments:
17792""""""""""
17793
17794The first operand is a vector of ``i1`` and indicates the condition.  The
17795second operand is the value that is selected where the condition vector is
17796true.  The third operand is the value that is selected where the condition
17797vector is false.  The vectors must be of the same size.  The fourth operand is
17798the explicit vector length.
17799
17800#. The optional ``fast-math flags`` marker indicates that the select has one or
17801   more :ref:`fast-math flags <fastmath>`. These are optimization hints to
17802   enable otherwise unsafe floating-point optimizations. Fast-math flags are
17803   only valid for selects that return a floating-point scalar or vector type,
17804   or an array (nested to any depth) of floating-point scalar or vector types.
17805
17806Semantics:
17807""""""""""
17808
17809The intrinsic selects lanes from the second and third operand depending on a
17810condition vector.
17811
All result lanes at positions greater than or equal to ``%evl`` are undefined.
17813For all lanes below ``%evl`` where the condition vector is true the lane is
17814taken from the second operand.  Otherwise, the lane is taken from the third
17815operand.
17816
17817Example:
17818""""""""
17819
17820.. code-block:: llvm
17821
17822      %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
17823
17824      ;;; Expansion.
17825      ;; Any result is legal on lanes at and above %evl.
17826      %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
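
The lane-wise behavior, including the undefined lanes at and above ``%evl``,
can also be written as a short Python model. The names ``vp_select`` and
``UNDEF`` are hypothetical; lists stand in for vectors and ``None`` marks an
undefined lane.

.. code-block:: python

      UNDEF = None  # stand-in for an undefined result lane

      def vp_select(cond, on_true, on_false, evl):
          """Pick on_true or on_false per lane below evl; other lanes are UNDEF."""
          if not 0 <= evl <= len(cond):
              raise ValueError("%evl must not exceed the number of lanes")
          return [(t if c else f) if i < evl else UNDEF
                  for i, (c, t, f) in enumerate(zip(cond, on_true, on_false))]

      print(vp_select([True, False, True, False],
                      [1, 2, 3, 4], [10, 20, 30, 40], evl=3))
      # [1, 20, 3, None]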
17827
17828
17829
17830.. _int_vp_add:
17831
17832'``llvm.vp.add.*``' Intrinsics
17833^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17834
17835Syntax:
17836"""""""
17837This is an overloaded intrinsic.
17838
17839::
17840
17841      declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17842      declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17843      declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17844
17845Overview:
17846"""""""""
17847
17848Predicated integer addition of two vectors of integers.
17849
17850
17851Arguments:
17852""""""""""
17853
17854The first two operands and the result have the same vector of integer type. The
17855third operand is the vector mask and has the same number of elements as the
17856result vector type. The fourth operand is the explicit vector length of the
17857operation.
17858
17859Semantics:
17860""""""""""
17861
17862The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
17863of the first and second vector operand on each enabled lane.  The result on
17864disabled lanes is undefined.
17865
17866Examples:
17867"""""""""
17868
17869.. code-block:: llvm
17870
17871      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17872      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17873
17874      %t = add <4 x i32> %a, %b
17875      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17876
17877.. _int_vp_sub:
17878
17879'``llvm.vp.sub.*``' Intrinsics
17880^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17881
17882Syntax:
17883"""""""
17884This is an overloaded intrinsic.
17885
17886::
17887
17888      declare <16 x i32>  @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17889      declare <vscale x 4 x i32>  @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17890      declare <256 x i64>  @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17891
17892Overview:
17893"""""""""
17894
17895Predicated integer subtraction of two vectors of integers.
17896
17897
17898Arguments:
17899""""""""""
17900
17901The first two operands and the result have the same vector of integer type. The
17902third operand is the vector mask and has the same number of elements as the
17903result vector type. The fourth operand is the explicit vector length of the
17904operation.
17905
17906Semantics:
17907""""""""""
17908
17909The '``llvm.vp.sub``' intrinsic performs integer subtraction
17910(:ref:`sub <i_sub>`)  of the first and second vector operand on each enabled
17911lane. The result on disabled lanes is undefined.
17912
17913Examples:
17914"""""""""
17915
17916.. code-block:: llvm
17917
17918      %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17919      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17920
17921      %t = sub <4 x i32> %a, %b
17922      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17923
17924
17925
17926.. _int_vp_mul:
17927
17928'``llvm.vp.mul.*``' Intrinsics
17929^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17930
17931Syntax:
17932"""""""
17933This is an overloaded intrinsic.
17934
17935::
17936
17937      declare <16 x i32>  @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17939      declare <256 x i64>  @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17940
17941Overview:
17942"""""""""
17943
17944Predicated integer multiplication of two vectors of integers.
17945
17946
17947Arguments:
17948""""""""""
17949
17950The first two operands and the result have the same vector of integer type. The
17951third operand is the vector mask and has the same number of elements as the
17952result vector type. The fourth operand is the explicit vector length of the
17953operation.
17954
17955Semantics:
17956""""""""""

The '``llvm.vp.mul``' intrinsic performs integer multiplication
(:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.
17960
17961Examples:
17962"""""""""
17963
17964.. code-block:: llvm
17965
17966      %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17967      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17968
17969      %t = mul <4 x i32> %a, %b
17970      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
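
The scalable-vector form is called in the same way. The following sketch
multiplies only the first ``%n`` lanes by pairing an all-true mask with an
explicit vector length. The function name ``@mul_first_n`` is hypothetical, and
``%n`` is assumed not to exceed the runtime element count (``vscale x 4``):

.. code-block:: llvm

      declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i1>, i32)

      define <vscale x 4 x i32> @mul_first_n(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, i32 %n) {
        ;; Build an all-true mask by splatting i1 true.
        %true.ins = insertelement <vscale x 4 x i1> undef, i1 true, i32 0
        %alltrue = shufflevector <vscale x 4 x i1> %true.ins, <vscale x 4 x i1> undef, <vscale x 4 x i32> zeroinitializer
        ;; Only lanes below %n are enabled; the remaining lanes of %r are undefined.
        %r = call <vscale x 4 x i32> @llvm.vp.mul.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %alltrue, i32 %n)
        ret <vscale x 4 x i32> %r
      }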
17971
17972
17973.. _int_vp_sdiv:
17974
17975'``llvm.vp.sdiv.*``' Intrinsics
17976^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17977
17978Syntax:
17979"""""""
17980This is an overloaded intrinsic.
17981
17982::
17983
17984      declare <16 x i32>  @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17985      declare <vscale x 4 x i32>  @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17986      declare <256 x i64>  @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17987
17988Overview:
17989"""""""""
17990
17991Predicated, signed division of two vectors of integers.
17992
17993
17994Arguments:
17995""""""""""
17996
17997The first two operands and the result have the same vector of integer type. The
17998third operand is the vector mask and has the same number of elements as the
17999result vector type. The fourth operand is the explicit vector length of the
18000operation.
18001
18002Semantics:
18003""""""""""
18004
18005The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
18006of the first and second vector operand on each enabled lane.  The result on
18007disabled lanes is undefined.
18008
18009Examples:
18010"""""""""
18011
18012.. code-block:: llvm
18013
18014      %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18015      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18016
18017      %t = sdiv <4 x i32> %a, %b
18018      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
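
An unpredicated ``sdiv`` executes on every lane, and division by zero is
undefined behavior for ``sdiv``. An expansion for a target without predicated
division may therefore want to guard the divisor on disabled lanes. The
following is a minimal sketch of that idea, assuming every lane is below
``%evl``; the function name ``@vp_sdiv_safe`` and the choice of ``1`` as the
substitute divisor are illustrative assumptions:

.. code-block:: llvm

      define <4 x i32> @vp_sdiv_safe(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask) {
        ;; Replace the divisor with 1 on lanes where the mask is false, so that
        ;; the full-width sdiv cannot divide by zero on a disabled lane.
        %safe.b = select <4 x i1> %mask, <4 x i32> %b, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
        %t = sdiv <4 x i32> %a, %safe.b
        ;; Disabled lanes still carry an undefined result.
        %r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
        ret <4 x i32> %r
      }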
18019
18020
18021.. _int_vp_udiv:
18022
18023'``llvm.vp.udiv.*``' Intrinsics
18024^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18025
18026Syntax:
18027"""""""
18028This is an overloaded intrinsic.
18029
18030::
18031
18032      declare <16 x i32>  @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18033      declare <vscale x 4 x i32>  @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18034      declare <256 x i64>  @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18035
18036Overview:
18037"""""""""
18038
18039Predicated, unsigned division of two vectors of integers.
18040
18041
18042Arguments:
18043""""""""""
18044
The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18046
18047Semantics:
18048""""""""""
18049
18050The '``llvm.vp.udiv``' intrinsic performs unsigned division
18051(:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
18052lane. The result on disabled lanes is undefined.
18053
18054Examples:
18055"""""""""
18056
18057.. code-block:: llvm
18058
18059      %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18060      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18061
18062      %t = udiv <4 x i32> %a, %b
18063      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18064
18065
18066
18067.. _int_vp_srem:
18068
18069'``llvm.vp.srem.*``' Intrinsics
18070^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18071
18072Syntax:
18073"""""""
18074This is an overloaded intrinsic.
18075
18076::
18077
18078      declare <16 x i32>  @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18079      declare <vscale x 4 x i32>  @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18080      declare <256 x i64>  @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18081
18082Overview:
18083"""""""""
18084
Predicated computation of the signed remainder of two integer vectors.
18086
18087
18088Arguments:
18089""""""""""
18090
18091The first two operands and the result have the same vector of integer type. The
18092third operand is the vector mask and has the same number of elements as the
18093result vector type. The fourth operand is the explicit vector length of the
18094operation.
18095
18096Semantics:
18097""""""""""
18098
18099The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
18100(:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
18101lane.  The result on disabled lanes is undefined.
18102
18103Examples:
18104"""""""""
18105
18106.. code-block:: llvm
18107
18108      %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18109      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18110
18111      %t = srem <4 x i32> %a, %b
18112      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18113
18114
18115
18116.. _int_vp_urem:
18117
18118'``llvm.vp.urem.*``' Intrinsics
18119^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18120
18121Syntax:
18122"""""""
18123This is an overloaded intrinsic.
18124
18125::
18126
18127      declare <16 x i32>  @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18128      declare <vscale x 4 x i32>  @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18129      declare <256 x i64>  @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18130
18131Overview:
18132"""""""""
18133
18134Predicated computation of the unsigned remainder of two integer vectors.
18135
18136
18137Arguments:
18138""""""""""
18139
18140The first two operands and the result have the same vector of integer type. The
18141third operand is the vector mask and has the same number of elements as the
18142result vector type. The fourth operand is the explicit vector length of the
18143operation.
18144
18145Semantics:
18146""""""""""
18147
18148The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
18149(:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
18150lane.  The result on disabled lanes is undefined.
18151
18152Examples:
18153"""""""""
18154
18155.. code-block:: llvm
18156
18157      %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18158      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18159
18160      %t = urem <4 x i32> %a, %b
18161      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18162
18163
18164.. _int_vp_ashr:
18165
18166'``llvm.vp.ashr.*``' Intrinsics
18167^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18168
18169Syntax:
18170"""""""
18171This is an overloaded intrinsic.
18172
18173::
18174
18175      declare <16 x i32>  @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18176      declare <vscale x 4 x i32>  @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18177      declare <256 x i64>  @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18178
18179Overview:
18180"""""""""
18181
18182Vector-predicated arithmetic right-shift.
18183
18184
18185Arguments:
18186""""""""""
18187
18188The first two operands and the result have the same vector of integer type. The
18189third operand is the vector mask and has the same number of elements as the
18190result vector type. The fourth operand is the explicit vector length of the
18191operation.
18192
18193Semantics:
18194""""""""""
18195
18196The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
18197(:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
18198enabled lane. The result on disabled lanes is undefined.
18199
18200Examples:
18201"""""""""
18202
18203.. code-block:: llvm
18204
18205      %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18206      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18207
18208      %t = ashr <4 x i32> %a, %b
18209      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18210
18211
18212.. _int_vp_lshr:
18213
18215'``llvm.vp.lshr.*``' Intrinsics
18216^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18217
18218Syntax:
18219"""""""
18220This is an overloaded intrinsic.
18221
18222::
18223
18224      declare <16 x i32>  @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18225      declare <vscale x 4 x i32>  @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18226      declare <256 x i64>  @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18227
18228Overview:
18229"""""""""
18230
18231Vector-predicated logical right-shift.
18232
18233
18234Arguments:
18235""""""""""
18236
18237The first two operands and the result have the same vector of integer type. The
18238third operand is the vector mask and has the same number of elements as the
18239result vector type. The fourth operand is the explicit vector length of the
18240operation.
18241
18242Semantics:
18243""""""""""
18244
18245The '``llvm.vp.lshr``' intrinsic computes the logical right shift
18246(:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
18247enabled lane. The result on disabled lanes is undefined.
18248
18249Examples:
18250"""""""""
18251
18252.. code-block:: llvm
18253
18254      %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18255      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18256
18257      %t = lshr <4 x i32> %a, %b
18258      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
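
To make the lane rules concrete, here is a small worked sketch with constant
operands. The function name ``@lshr_demo`` is hypothetical; the values in the
comments follow from the ``lshr`` semantics referenced above:

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @lshr_demo() {
        ;; Lanes 0 and 1 are enabled. Lane 2 is masked off, and lane 3 is not
        ;; below the explicit vector length of 3.
        %r = call <4 x i32> @llvm.vp.lshr.v4i32(
                 <4 x i32> <i32 -8, i32 16, i32 -8, i32 16>,
                 <4 x i32> <i32 1, i32 2, i32 1, i32 2>,
                 <4 x i1> <i1 true, i1 true, i1 false, i1 true>,
                 i32 3)
        ;; lane 0: -8 lshr 1 = 2147483644 (a zero is shifted into the sign bit;
        ;;         an arithmetic shift would give -4)
        ;; lane 1: 16 lshr 2 = 4
        ;; lanes 2 and 3: undefined
        ret <4 x i32> %r
      }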
18259
18260
18261.. _int_vp_shl:
18262
18263'``llvm.vp.shl.*``' Intrinsics
18264^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18265
18266Syntax:
18267"""""""
18268This is an overloaded intrinsic.
18269
18270::
18271
18272      declare <16 x i32>  @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18273      declare <vscale x 4 x i32>  @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18274      declare <256 x i64>  @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18275
18276Overview:
18277"""""""""
18278
18279Vector-predicated left shift.
18280
18281
18282Arguments:
18283""""""""""
18284
18285The first two operands and the result have the same vector of integer type. The
18286third operand is the vector mask and has the same number of elements as the
18287result vector type. The fourth operand is the explicit vector length of the
18288operation.
18289
18290Semantics:
18291""""""""""
18292
18293The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
18294the first operand by the second operand on each enabled lane.  The result on
18295disabled lanes is undefined.
18296
18297Examples:
18298"""""""""
18299
18300.. code-block:: llvm
18301
18302      %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18303      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18304
18305      %t = shl <4 x i32> %a, %b
18306      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18307
18308
18309.. _int_vp_or:
18310
18311'``llvm.vp.or.*``' Intrinsics
18312^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18313
18314Syntax:
18315"""""""
18316This is an overloaded intrinsic.
18317
18318::
18319
18320      declare <16 x i32>  @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18321      declare <vscale x 4 x i32>  @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18322      declare <256 x i64>  @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18323
18324Overview:
18325"""""""""
18326
18327Vector-predicated or.
18328
18329
18330Arguments:
18331""""""""""
18332
18333The first two operands and the result have the same vector of integer type. The
18334third operand is the vector mask and has the same number of elements as the
18335result vector type. The fourth operand is the explicit vector length of the
18336operation.
18337
18338Semantics:
18339""""""""""
18340
18341The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
18342first two operands on each enabled lane.  The result on disabled lanes is
18343undefined.
18344
18345Examples:
18346"""""""""
18347
18348.. code-block:: llvm
18349
18350      %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18351      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18352
18353      %t = or <4 x i32> %a, %b
18354      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18355
18356
18357.. _int_vp_and:
18358
18359'``llvm.vp.and.*``' Intrinsics
18360^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18361
18362Syntax:
18363"""""""
18364This is an overloaded intrinsic.
18365
18366::
18367
18368      declare <16 x i32>  @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18369      declare <vscale x 4 x i32>  @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18370      declare <256 x i64>  @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18371
18372Overview:
18373"""""""""
18374
18375Vector-predicated and.
18376
18377
18378Arguments:
18379""""""""""
18380
18381The first two operands and the result have the same vector of integer type. The
18382third operand is the vector mask and has the same number of elements as the
18383result vector type. The fourth operand is the explicit vector length of the
18384operation.
18385
18386Semantics:
18387""""""""""
18388
The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two operands on each enabled lane.  The result on disabled lanes is
undefined.
18392
18393Examples:
18394"""""""""
18395
18396.. code-block:: llvm
18397
18398      %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18399      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18400
18401      %t = and <4 x i32> %a, %b
18402      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18403
18404
18405.. _int_vp_xor:
18406
18407'``llvm.vp.xor.*``' Intrinsics
18408^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18409
18410Syntax:
18411"""""""
18412This is an overloaded intrinsic.
18413
18414::
18415
18416      declare <16 x i32>  @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18417      declare <vscale x 4 x i32>  @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18418      declare <256 x i64>  @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18419
18420Overview:
18421"""""""""
18422
18423Vector-predicated, bitwise xor.
18424
18425
18426Arguments:
18427""""""""""
18428
18429The first two operands and the result have the same vector of integer type. The
18430third operand is the vector mask and has the same number of elements as the
18431result vector type. The fourth operand is the explicit vector length of the
18432operation.
18433
18434Semantics:
18435""""""""""
18436
The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
the first two operands on each enabled lane.  The result on disabled lanes is
undefined.
18440
18441Examples:
18442"""""""""
18443
18444.. code-block:: llvm
18445
18446      %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18447      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18448
18449      %t = xor <4 x i32> %a, %b
18450      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
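
As with the plain ``xor`` instruction, an all-ones second operand gives a
bitwise complement, here restricted to the enabled lanes. This is only a usage
sketch; the function name ``@vp_not`` is hypothetical:

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.xor.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @vp_not(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ;; xor with -1 (all bits set) flips every bit of each enabled lane.
        %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }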
18451
18452
18453.. _int_vp_fadd:
18454
18455'``llvm.vp.fadd.*``' Intrinsics
18456^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18457
18458Syntax:
18459"""""""
18460This is an overloaded intrinsic.
18461
18462::
18463
18464      declare <16 x float>  @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18465      declare <vscale x 4 x float>  @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18466      declare <256 x double>  @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18467
18468Overview:
18469"""""""""
18470
18471Predicated floating-point addition of two vectors of floating-point values.
18472
18473
18474Arguments:
18475""""""""""
18476
18477The first two operands and the result have the same vector of floating-point type. The
18478third operand is the vector mask and has the same number of elements as the
18479result vector type. The fourth operand is the explicit vector length of the
18480operation.
18481
18482Semantics:
18483""""""""""
18484
The '``llvm.vp.fadd``' intrinsic performs floating-point addition
(:ref:`fadd <i_fadd>`) of the first and second vector operand on each enabled
lane.  The result on disabled lanes is undefined.  The operation is performed
in the default floating-point environment.
18489
18490Examples:
18491"""""""""
18492
18493.. code-block:: llvm
18494
18495      %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18496      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18497
18498      %t = fadd <4 x float> %a, %b
18499      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18500
18501
18502.. _int_vp_fsub:
18503
18504'``llvm.vp.fsub.*``' Intrinsics
18505^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18506
18507Syntax:
18508"""""""
18509This is an overloaded intrinsic.
18510
18511::
18512
18513      declare <16 x float>  @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18514      declare <vscale x 4 x float>  @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18515      declare <256 x double>  @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18516
18517Overview:
18518"""""""""
18519
18520Predicated floating-point subtraction of two vectors of floating-point values.
18521
18522
18523Arguments:
18524""""""""""
18525
18526The first two operands and the result have the same vector of floating-point type. The
18527third operand is the vector mask and has the same number of elements as the
18528result vector type. The fourth operand is the explicit vector length of the
18529operation.
18530
18531Semantics:
18532""""""""""
18533
The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction
(:ref:`fsub <i_fsub>`) of the first and second vector operand on each enabled
lane.  The result on disabled lanes is undefined.  The operation is performed
in the default floating-point environment.
18538
18539Examples:
18540"""""""""
18541
18542.. code-block:: llvm
18543
18544      %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18545      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18546
18547      %t = fsub <4 x float> %a, %b
18548      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18549
18550
18551.. _int_vp_fmul:
18552
18553'``llvm.vp.fmul.*``' Intrinsics
18554^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18555
18556Syntax:
18557"""""""
18558This is an overloaded intrinsic.
18559
18560::
18561
18562      declare <16 x float>  @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18563      declare <vscale x 4 x float>  @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18564      declare <256 x double>  @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18565
18566Overview:
18567"""""""""
18568
18569Predicated floating-point multiplication of two vectors of floating-point values.
18570
18571
18572Arguments:
18573""""""""""
18574
18575The first two operands and the result have the same vector of floating-point type. The
18576third operand is the vector mask and has the same number of elements as the
18577result vector type. The fourth operand is the explicit vector length of the
18578operation.
18579
18580Semantics:
18581""""""""""
18582
The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication
(:ref:`fmul <i_fmul>`) of the first and second vector operand on each enabled
lane.  The result on disabled lanes is undefined.  The operation is performed
in the default floating-point environment.
18587
18588Examples:
18589"""""""""
18590
18591.. code-block:: llvm
18592
18593      %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18594      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18595
18596      %t = fmul <4 x float> %a, %b
18597      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18598
18599
18600.. _int_vp_fdiv:
18601
18602'``llvm.vp.fdiv.*``' Intrinsics
18603^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18604
18605Syntax:
18606"""""""
18607This is an overloaded intrinsic.
18608
18609::
18610
18611      declare <16 x float>  @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18612      declare <vscale x 4 x float>  @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18613      declare <256 x double>  @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18614
18615Overview:
18616"""""""""
18617
18618Predicated floating-point division of two vectors of floating-point values.
18619
18620
18621Arguments:
18622""""""""""
18623
18624The first two operands and the result have the same vector of floating-point type. The
18625third operand is the vector mask and has the same number of elements as the
18626result vector type. The fourth operand is the explicit vector length of the
18627operation.
18628
18629Semantics:
18630""""""""""
18631
The '``llvm.vp.fdiv``' intrinsic performs floating-point division
(:ref:`fdiv <i_fdiv>`) of the first and second vector operand on each enabled
lane.  The result on disabled lanes is undefined.  The operation is performed
in the default floating-point environment.
18636
18637Examples:
18638"""""""""
18639
18640.. code-block:: llvm
18641
18642      %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18643      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18644
18645      %t = fdiv <4 x float> %a, %b
18646      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
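
A small worked sketch with constant operands may help here. In the default
floating-point environment, dividing by zero yields an infinity or a NaN rather
than undefined behavior, so enabled lanes with a zero divisor still produce a
defined value. The function name ``@fdiv_demo`` is hypothetical:

.. code-block:: llvm

      declare <4 x float> @llvm.vp.fdiv.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      define <4 x float> @fdiv_demo() {
        %r = call <4 x float> @llvm.vp.fdiv.v4f32(
                 <4 x float> <float 1.0, float 1.0, float 0.0, float 6.0>,
                 <4 x float> <float 0.0, float 2.0, float 0.0, float 3.0>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 false>,
                 i32 4)
        ;; lane 0: 1.0 / 0.0 = +infinity
        ;; lane 1: 1.0 / 2.0 = 0.5
        ;; lane 2: 0.0 / 0.0 = NaN
        ;; lane 3: disabled by the mask, result undefined
        ret <4 x float> %r
      }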
18647
18648
18649.. _int_vp_frem:
18650
18651'``llvm.vp.frem.*``' Intrinsics
18652^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18653
18654Syntax:
18655"""""""
18656This is an overloaded intrinsic.
18657
18658::
18659
18660      declare <16 x float>  @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18661      declare <vscale x 4 x float>  @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18662      declare <256 x double>  @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18663
18664Overview:
18665"""""""""
18666
18667Predicated floating-point remainder of two vectors of floating-point values.
18668
18669
18670Arguments:
18671""""""""""
18672
18673The first two operands and the result have the same vector of floating-point type. The
18674third operand is the vector mask and has the same number of elements as the
18675result vector type. The fourth operand is the explicit vector length of the
18676operation.
18677
18678Semantics:
18679""""""""""
18680
The '``llvm.vp.frem``' intrinsic performs floating-point remainder
(:ref:`frem <i_frem>`) of the first and second vector operand on each enabled
lane.  The result on disabled lanes is undefined.  The operation is performed
in the default floating-point environment.
18685
18686Examples:
18687"""""""""
18688
18689.. code-block:: llvm
18690
18691      %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18692      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18693
18694      %t = frem <4 x float> %a, %b
18695      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18696
18697
18698
18699.. _int_vp_reduce_add:
18700
18701'``llvm.vp.reduce.add.*``' Intrinsics
18702^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18703
18704Syntax:
18705"""""""
18706This is an overloaded intrinsic.
18707
18708::
18709
18710      declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18711      declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18712
18713Overview:
18714"""""""""
18715
18716Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
18717returning the result as a scalar.
18718
18719Arguments:
18720""""""""""
18721
18722The first operand is the start value of the reduction, which must be a scalar
18723integer type equal to the result type. The second operand is the vector on
18724which the reduction is performed and must be a vector of integer values whose
18725element type is the result/start type. The third operand is the vector mask and
18726is a vector of boolean values with the same number of elements as the vector
18727operand. The fourth operand is the explicit vector length of the operation.
18728
18729Semantics:
18730""""""""""
18731
18732The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
18733(:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
18734``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
18735lanes are treated as containing the neutral value ``0`` (i.e. having no effect
18736on the reduction operation). If the vector length is zero, the result is equal
18737to ``start_value``.
18738
18739To ignore the start value, the neutral value can be used.
18740
18741Examples:
18742"""""""""
18743
18744.. code-block:: llvm
18745
18746      %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18747      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18748      ; are treated as though %mask were false for those lanes.
18749
18750      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
18751      %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
18752      %also.r = add i32 %reduction, %start
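
The expansion above leaves the handling of ``%evl`` to the comment. One
possible way to spell it out, offered as a sketch rather than the canonical
lowering, is to fold ``%evl`` into the mask first. The function name
``@vp_reduce_add_expanded`` is hypothetical, and ``%evl`` is assumed not to
exceed the vector length of 4:

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)

      define i32 @vp_reduce_add_expanded(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ;; Lane i is enabled iff its mask bit is set and i is below %evl.
        %evl.ins = insertelement <4 x i32> undef, i32 %evl, i32 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> undef, <4 x i32> zeroinitializer
        %below.evl = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        %enabled = and <4 x i1> %mask, %below.evl
        ;; Disabled lanes contribute the neutral value 0.
        %masked.a = select <4 x i1> %enabled, <4 x i32> %a, <4 x i32> zeroinitializer
        %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
        %r = add i32 %reduction, %start
        ret i32 %r
      }

Passing ``i32 0`` as ``%start`` yields the sum of the enabled lanes alone.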
18753
18754
18755.. _int_vp_reduce_fadd:
18756
18757'``llvm.vp.reduce.fadd.*``' Intrinsics
18758^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18759
18760Syntax:
18761"""""""
18762This is an overloaded intrinsic.
18763
18764::
18765
18766      declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18767      declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18768
18769Overview:
18770"""""""""
18771
18772Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
18773value, returning the result as a scalar.
18774
18775Arguments:
18776""""""""""
18777
18778The first operand is the start value of the reduction, which must be a scalar
18779floating-point type equal to the result type. The second operand is the vector
18780on which the reduction is performed and must be a vector of floating-point
18781values whose element type is the result/start type. The third operand is the
18782vector mask and is a vector of boolean values with the same number of elements
18783as the vector operand. The fourth operand is the explicit vector length of the
18784operation.
18785
18786Semantics:
18787""""""""""
18788
18789The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
18790reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
18791vector operand ``val`` on each enabled lane, adding it to the scalar
18792``start_value``. Disabled lanes are treated as containing the neutral value
18793``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
18794enabled, the resulting value will be equal to ``start_value``.
18795
18796To ignore the start value, the neutral value can be used.
18797
18798See the unpredicated version (:ref:`llvm.vector.reduce.fadd
18799<int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
18800
18801Examples:
18802"""""""""
18803
18804.. code-block:: llvm
18805
18806      %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
18807      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18808      ; are treated as though %mask were false for those lanes.
18809
18810      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
18811      %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
18812
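The following sketch uses constant operands whose sums are exactly
representable, so the result does not depend on rounding. The function name
``@vp_reduce_fadd_example`` and the operand values are illustrative
assumptions. The neutral value is ``-0.0`` rather than ``+0.0`` because, under
the default rounding mode, ``x + -0.0`` equals ``x`` for every ``x``, whereas
``-0.0 + +0.0`` yields ``+0.0``.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fadd.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @vp_reduce_fadd_example() {
        ; Only lanes 0 and 1 are enabled: lanes 2 and 3 are at or beyond the
        ; explicit vector length 2 (lane 2 is also masked off).
        %r = call float @llvm.vp.reduce.fadd.v4f32(float 1.5, <4 x float> <float 0.5, float 2.0, float 4.0, float 8.0>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 2)
        ; %r = (1.5 + 0.5) + 2.0 = 4.0; the disabled lanes act as -0.0.
        ret float %r
      }
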
18813
18814.. _int_vp_reduce_mul:
18815
18816'``llvm.vp.reduce.mul.*``' Intrinsics
18817^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18818
18819Syntax:
18820"""""""
18821This is an overloaded intrinsic.
18822
18823::
18824
18825      declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18826      declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18827
18828Overview:
18829"""""""""
18830
18831Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
18832returning the result as a scalar.
18833
18834
18835Arguments:
18836""""""""""
18837
18838The first operand is the start value of the reduction, which must be a scalar
18839integer type equal to the result type. The second operand is the vector on
18840which the reduction is performed and must be a vector of integer values whose
18841element type is the result/start type. The third operand is the vector mask and
18842is a vector of boolean values with the same number of elements as the vector
18843operand. The fourth operand is the explicit vector length of the operation.
18844
18845Semantics:
18846""""""""""
18847
18848The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
18849(:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
18850on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
18851lanes are treated as containing the neutral value ``1`` (i.e. having no effect
18852on the reduction operation). If the vector length is zero, the result is the
18853start value.
18854
18855To ignore the start value, the neutral value can be used.
18856
18857Examples:
18858"""""""""
18859
18860.. code-block:: llvm
18861
18862      %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18863      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18864      ; are treated as though %mask were false for those lanes.
18865
18866      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
18867      %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
18868      %also.r = mul i32 %reduction, %start
18869
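A concrete instance with constant operands may make the masking easier to
follow. The function name ``@vp_reduce_mul_example`` and the operand values
are assumptions made for this sketch.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.mul.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @vp_reduce_mul_example() {
        ; All four lanes are below the explicit vector length 4, but lane 3 is
        ; disabled by the mask and therefore acts as the neutral value 1.
        %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 2, <4 x i32> <i32 3, i32 5, i32 7, i32 11>, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, i32 4)
        ; %r = 2 * 3 * 5 * 7 = 210.
        ret i32 %r
      }
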
18870.. _int_vp_reduce_fmul:
18871
18872'``llvm.vp.reduce.fmul.*``' Intrinsics
18873^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18874
18875Syntax:
18876"""""""
18877This is an overloaded intrinsic.
18878
18879::
18880
18881      declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18882      declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18883
18884Overview:
18885"""""""""
18886
18887Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
18888value, returning the result as a scalar.
18889
18890
18891Arguments:
18892""""""""""
18893
18894The first operand is the start value of the reduction, which must be a scalar
18895floating-point type equal to the result type. The second operand is the vector
18896on which the reduction is performed and must be a vector of floating-point
18897values whose element type is the result/start type. The third operand is the
18898vector mask and is a vector of boolean values with the same number of elements
18899as the vector operand. The fourth operand is the explicit vector length of the
18900operation.
18901
18902Semantics:
18903""""""""""
18904
18905The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
18906reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
18907vector operand ``val`` on each enabled lane, multiplying it by the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
18909``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
18910enabled, the resulting value will be equal to the starting value.
18911
18912To ignore the start value, the neutral value can be used.
18913
18914See the unpredicated version (:ref:`llvm.vector.reduce.fmul
18915<int_vector_reduce_fmul>`) for more detail on the semantics.
18916
18917Examples:
18918"""""""""
18919
18920.. code-block:: llvm
18921
18922      %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
18923      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18924      ; are treated as though %mask were false for those lanes.
18925
18926      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
18927      %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
18928
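The sketch below shows the effect of the explicit vector length alone, with an
all-true mask. The function name ``@vp_reduce_fmul_example`` and the operand
values are illustrative assumptions; the values were picked so that every
intermediate product is exactly representable.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmul.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @vp_reduce_fmul_example() {
        ; The mask enables every lane, but lane 3 is not less than the explicit
        ; vector length 3, so it acts as the neutral value 1.0.
        %r = call float @llvm.vp.reduce.fmul.v4f32(float 2.0, <4 x float> <float 0.5, float 4.0, float 3.0, float 100.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ; %r = ((2.0 * 0.5) * 4.0) * 3.0 = 12.0.
        ret float %r
      }
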
18929
18930.. _int_vp_reduce_and:
18931
18932'``llvm.vp.reduce.and.*``' Intrinsics
18933^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18934
18935Syntax:
18936"""""""
18937This is an overloaded intrinsic.
18938
18939::
18940
18941      declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18942      declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18943
18944Overview:
18945"""""""""
18946
18947Predicated integer ``AND`` reduction of a vector and a scalar starting value,
18948returning the result as a scalar.
18949
18950
18951Arguments:
18952""""""""""
18953
18954The first operand is the start value of the reduction, which must be a scalar
18955integer type equal to the result type. The second operand is the vector on
18956which the reduction is performed and must be a vector of integer values whose
18957element type is the result/start type. The third operand is the vector mask and
18958is a vector of boolean values with the same number of elements as the vector
18959operand. The fourth operand is the explicit vector length of the operation.
18960
18961Semantics:
18962""""""""""
18963
18964The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
18965(:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
``val`` on each enabled lane, performing an '``and``' of that with the
18967scalar ``start_value``. Disabled lanes are treated as containing the neutral
18968value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
18969operation). If the vector length is zero, the result is the start value.
18970
18971To ignore the start value, the neutral value can be used.
18972
18973Examples:
18974"""""""""
18975
18976.. code-block:: llvm
18977
18978      %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18979      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18980      ; are treated as though %mask were false for those lanes.
18981
18982      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
18983      %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
18984      %also.r = and i32 %reduction, %start
18985
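To see why the neutral value must be all ones, consider the following sketch
with constant operands. The function name ``@vp_reduce_and_example`` and the
operand values are assumptions chosen for illustration.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.and.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @vp_reduce_and_example() {
        ; Lane 2 holds 0 but is disabled by the mask, so it acts as -1 (all
        ; bits set) and does not clear the result.
        %r = call i32 @llvm.vp.reduce.and.v4i32(i32 255, <4 x i32> <i32 15, i32 7, i32 0, i32 13>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
        ; %r = 255 & 15 & 7 & 13 = 5.
        ret i32 %r
      }
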
18986
18987.. _int_vp_reduce_or:
18988
18989'``llvm.vp.reduce.or.*``' Intrinsics
18990^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18991
18992Syntax:
18993"""""""
18994This is an overloaded intrinsic.
18995
18996::
18997
18998      declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18999      declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19000
19001Overview:
19002"""""""""
19003
19004Predicated integer ``OR`` reduction of a vector and a scalar starting value,
19005returning the result as a scalar.
19006
19007
19008Arguments:
19009""""""""""
19010
19011The first operand is the start value of the reduction, which must be a scalar
19012integer type equal to the result type. The second operand is the vector on
19013which the reduction is performed and must be a vector of integer values whose
19014element type is the result/start type. The third operand is the vector mask and
19015is a vector of boolean values with the same number of elements as the vector
19016operand. The fourth operand is the explicit vector length of the operation.
19017
19018Semantics:
19019""""""""""
19020
19021The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
19022(:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
19023``val`` on each enabled lane, performing an '``or``' of that with the scalar
19024``start_value``. Disabled lanes are treated as containing the neutral value
19025``0`` (i.e. having no effect on the reduction operation). If the vector length
19026is zero, the result is the start value.
19027
19028To ignore the start value, the neutral value can be used.
19029
19030Examples:
19031"""""""""
19032
19033.. code-block:: llvm
19034
19035      %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19036      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19037      ; are treated as though %mask were false for those lanes.
19038
19039      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19040      %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
19041      %also.r = or i32 %reduction, %start
19042
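The sketch below passes the neutral value ``0`` as the start value, so the
result depends only on the enabled lanes. The function name
``@vp_reduce_or_example`` and the operand values are illustrative assumptions.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.or.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @vp_reduce_or_example() {
        ; Lane 1 is disabled by the mask and lane 3 by the explicit vector
        ; length 3; both act as the neutral value 0.
        %r = call i32 @llvm.vp.reduce.or.v4i32(i32 0, <4 x i32> <i32 1, i32 2, i32 4, i32 8>, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, i32 3)
        ; %r = 0 | 1 | 4 = 5.
        ret i32 %r
      }
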
19043.. _int_vp_reduce_xor:
19044
19045'``llvm.vp.reduce.xor.*``' Intrinsics
19046^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19047
19048Syntax:
19049"""""""
19050This is an overloaded intrinsic.
19051
19052::
19053
19054      declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19055      declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19056
19057Overview:
19058"""""""""
19059
19060Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
19061returning the result as a scalar.
19062
19063
19064Arguments:
19065""""""""""
19066
19067The first operand is the start value of the reduction, which must be a scalar
19068integer type equal to the result type. The second operand is the vector on
19069which the reduction is performed and must be a vector of integer values whose
19070element type is the result/start type. The third operand is the vector mask and
19071is a vector of boolean values with the same number of elements as the vector
19072operand. The fourth operand is the explicit vector length of the operation.
19073
19074Semantics:
19075""""""""""
19076
19077The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
19078(:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
19079``val`` on each enabled lane, performing an '``xor``' of that with the scalar
19080``start_value``. Disabled lanes are treated as containing the neutral value
19081``0`` (i.e. having no effect on the reduction operation). If the vector length
19082is zero, the result is the start value.
19083
19084To ignore the start value, the neutral value can be used.
19085
19086Examples:
19087"""""""""
19088
19089.. code-block:: llvm
19090
19091      %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19092      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19093      ; are treated as though %mask were false for those lanes.
19094
19095      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19096      %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
19097      %also.r = xor i32 %reduction, %start
19098
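A hand-checkable instance with constant operands is sketched below. The
function name ``@vp_reduce_xor_example`` and the operand values are
assumptions made for this example.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.xor.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @vp_reduce_xor_example() {
        ; The mask enables every lane, but only lanes 0 and 1 are below the
        ; explicit vector length 2.
        %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 10, <4 x i32> <i32 1, i32 3, i32 5, i32 7>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2)
        ; %r = 10 ^ 1 ^ 3 = 8.
        ret i32 %r
      }
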
19099
19100.. _int_vp_reduce_smax:
19101
19102'``llvm.vp.reduce.smax.*``' Intrinsics
19103^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19104
19105Syntax:
19106"""""""
19107This is an overloaded intrinsic.
19108
19109::
19110
19111      declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19112      declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19113
19114Overview:
19115"""""""""
19116
19117Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
19118value, returning the result as a scalar.
19119
19120
19121Arguments:
19122""""""""""
19123
19124The first operand is the start value of the reduction, which must be a scalar
19125integer type equal to the result type. The second operand is the vector on
19126which the reduction is performed and must be a vector of integer values whose
19127element type is the result/start type. The third operand is the vector mask and
19128is a vector of boolean values with the same number of elements as the vector
19129operand. The fourth operand is the explicit vector length of the operation.
19130
19131Semantics:
19132""""""""""
19133
19134The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
19135reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
19137the scalar ``start_value``. Disabled lanes are treated as containing the
19138neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
19139If the vector length is zero, the result is the start value.
19140
19141To ignore the start value, the neutral value can be used.
19142
19143Examples:
19144"""""""""
19145
19146.. code-block:: llvm
19147
19148      %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19149      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19150      ; are treated as though %mask were false for those lanes.
19151
19152      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
19153      %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
19154      %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
19155
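The following sketch ignores the start value by passing the neutral value,
which is ``-128`` for ``i8``. The function name ``@vp_reduce_smax_example``
and the operand values are illustrative assumptions.

.. code-block:: llvm

      declare i8 @llvm.vp.reduce.smax.v4i8(i8, <4 x i8>, <4 x i1>, i32)

      define i8 @vp_reduce_smax_example() {
        ; Lane 2 holds the largest element, 100, but is disabled by the mask
        ; and therefore acts as -128.
        %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 -128, <4 x i8> <i8 -5, i8 -20, i8 100, i8 -1>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
        ; %r = smax(-128, -5, -20, -1) = -1.
        ret i8 %r
      }
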
19156
19157.. _int_vp_reduce_smin:
19158
19159'``llvm.vp.reduce.smin.*``' Intrinsics
19160^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19161
19162Syntax:
19163"""""""
19164This is an overloaded intrinsic.
19165
19166::
19167
19168      declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19169      declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19170
19171Overview:
19172"""""""""
19173
19174Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
19175value, returning the result as a scalar.
19176
19177
19178Arguments:
19179""""""""""
19180
19181The first operand is the start value of the reduction, which must be a scalar
19182integer type equal to the result type. The second operand is the vector on
19183which the reduction is performed and must be a vector of integer values whose
19184element type is the result/start type. The third operand is the vector mask and
19185is a vector of boolean values with the same number of elements as the vector
19186operand. The fourth operand is the explicit vector length of the operation.
19187
19188Semantics:
19189""""""""""
19190
19191The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
19192reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and
19194the scalar ``start_value``. Disabled lanes are treated as containing the
19195neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
19196If the vector length is zero, the result is the start value.
19197
19198To ignore the start value, the neutral value can be used.
19199
19200Examples:
19201"""""""""
19202
19203.. code-block:: llvm
19204
19205      %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19206      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19207      ; are treated as though %mask were false for those lanes.
19208
19209      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
19210      %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
19211      %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
19212
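The sketch below passes the neutral value, ``127`` for ``i8``, as the start
value and relies on the explicit vector length to exclude the last lane. The
function name ``@vp_reduce_smin_example`` and the operand values are
assumptions chosen for illustration.

.. code-block:: llvm

      declare i8 @llvm.vp.reduce.smin.v4i8(i8, <4 x i8>, <4 x i1>, i32)

      define i8 @vp_reduce_smin_example() {
        ; Lane 3 holds the smallest element, -100, but its index is not less
        ; than the explicit vector length 3, so it acts as 127.
        %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 127, <4 x i8> <i8 -5, i8 -20, i8 100, i8 -100>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ; %r = smin(127, -5, -20, 100) = -20.
        ret i8 %r
      }
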
19213
19214.. _int_vp_reduce_umax:
19215
19216'``llvm.vp.reduce.umax.*``' Intrinsics
19217^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19218
19219Syntax:
19220"""""""
19221This is an overloaded intrinsic.
19222
19223::
19224
19225      declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19226      declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19227
19228Overview:
19229"""""""""
19230
19231Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
19232value, returning the result as a scalar.
19233
19234
19235Arguments:
19236""""""""""
19237
19238The first operand is the start value of the reduction, which must be a scalar
19239integer type equal to the result type. The second operand is the vector on
19240which the reduction is performed and must be a vector of integer values whose
19241element type is the result/start type. The third operand is the vector mask and
19242is a vector of boolean values with the same number of elements as the vector
19243operand. The fourth operand is the explicit vector length of the operation.
19244
19245Semantics:
19246""""""""""
19247
19248The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
19249reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
19251the scalar ``start_value``. Disabled lanes are treated as containing the
19252neutral value ``0`` (i.e. having no effect on the reduction operation). If the
19253vector length is zero, the result is the start value.
19254
19255To ignore the start value, the neutral value can be used.
19256
19257Examples:
19258"""""""""
19259
19260.. code-block:: llvm
19261
19262      %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19263      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19264      ; are treated as though %mask were false for those lanes.
19265
19266      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19267      %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
19268      %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
19269
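Because the comparison is unsigned, the constant ``i32 -1`` is the largest
possible element. The sketch below, whose function name
``@vp_reduce_umax_example`` and operand values are illustrative assumptions,
shows such an element being excluded by the mask.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @vp_reduce_umax_example() {
        ; Lane 0 holds -1 (4294967295 when read as unsigned) but is disabled
        ; by the mask and therefore acts as the neutral value 0.
        %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 0, <4 x i32> <i32 -1, i32 5, i32 7, i32 9>, <4 x i1> <i1 false, i1 true, i1 true, i1 true>, i32 4)
        ; %r = umax(0, 5, 7, 9) = 9.
        ret i32 %r
      }
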
19270
19271.. _int_vp_reduce_umin:
19272
19273'``llvm.vp.reduce.umin.*``' Intrinsics
19274^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19275
19276Syntax:
19277"""""""
19278This is an overloaded intrinsic.
19279
19280::
19281
19282      declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19283      declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19284
19285Overview:
19286"""""""""
19287
19288Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
19289value, returning the result as a scalar.
19290
19291
19292Arguments:
19293""""""""""
19294
19295The first operand is the start value of the reduction, which must be a scalar
19296integer type equal to the result type. The second operand is the vector on
19297which the reduction is performed and must be a vector of integer values whose
19298element type is the result/start type. The third operand is the vector mask and
19299is a vector of boolean values with the same number of elements as the vector
19300operand. The fourth operand is the explicit vector length of the operation.
19301
19302Semantics:
19303""""""""""
19304
19305The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
19306reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
19307vector operand ``val`` on each enabled lane, taking the minimum of that and the
19308scalar ``start_value``. Disabled lanes are treated as containing the neutral
19309value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
19310operation). If the vector length is zero, the result is the start value.
19311
19312To ignore the start value, the neutral value can be used.
19313
19314Examples:
19315"""""""""
19316
19317.. code-block:: llvm
19318
19319      %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19320      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19321      ; are treated as though %mask were false for those lanes.
19322
19323      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
19324      %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
19325      %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
19326
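The sketch below ignores the start value by passing ``i32 -1``, which is
``UINT_MAX`` under the unsigned interpretation. The function name
``@vp_reduce_umin_example`` and the operand values are assumptions made for
this example.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umin.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @vp_reduce_umin_example() {
        ; Lane 3 holds the smallest element, 0, but its index is not less than
        ; the explicit vector length 3, so it acts as UINT_MAX.
        %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 -1, <4 x i32> <i32 3, i32 1, i32 2, i32 0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ; %r = umin(4294967295, 3, 1, 2) = 1.
        ret i32 %r
      }
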
19327
19328.. _int_vp_reduce_fmax:
19329
19330'``llvm.vp.reduce.fmax.*``' Intrinsics
19331^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19332
19333Syntax:
19334"""""""
19335This is an overloaded intrinsic.
19336
19337::
19338
      declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19340      declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19341
19342Overview:
19343"""""""""
19344
19345Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
19346value, returning the result as a scalar.
19347
19348
19349Arguments:
19350""""""""""
19351
19352The first operand is the start value of the reduction, which must be a scalar
19353floating-point type equal to the result type. The second operand is the vector
19354on which the reduction is performed and must be a vector of floating-point
19355values whose element type is the result/start type. The third operand is the
19356vector mask and is a vector of boolean values with the same number of elements
19357as the vector operand. The fourth operand is the explicit vector length of the
19358operation.
19359
19360Semantics:
19361""""""""""
19362
19363The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
19364reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
19365vector operand ``val`` on each enabled lane, taking the maximum of that and the
19366scalar ``start_value``. Disabled lanes are treated as containing the neutral
19367value (i.e. having no effect on the reduction operation). If the vector length
19368is zero, the result is the start value.
19369
19370The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set, the neutral value is ``-QNAN``. If ``nnan`` and ``ninf`` are
19372both set, then the neutral value is the smallest floating-point value for the
19373result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
19374
This intrinsic has the same comparison semantics as the
19376:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
19377'``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
19378unless all elements of the vector and the starting value are ``NaN``. For a
19379vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19380``-0.0`` elements, the sign of the result is unspecified.
19381
19382To ignore the start value, the neutral value can be used.
19383
19384Examples:
19385"""""""""
19386
19387.. code-block:: llvm
19388
      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19390      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19391      ; are treated as though %mask were false for those lanes.
19392
19393      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19394      %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
19395      %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
19396
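The sketch below illustrates the case where only ``nnan`` is set, for which
the neutral value is ``-Infinity`` (written ``0xFFF0000000000000`` in LLVM
assembly). The function name ``@vp_reduce_fmax_example`` and the operand
values are illustrative assumptions.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @vp_reduce_fmax_example() {
        ; The start value is -Infinity, so it does not affect the result.
        ; Lane 2 holds the largest element, 8.0, but is disabled by the mask.
        %r = call nnan float @llvm.vp.reduce.fmax.v4f32(float 0xFFF0000000000000, <4 x float> <float 1.0, float -2.0, float 8.0, float 3.0>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
        ; %r = 3.0, the maximum of the enabled lanes 1.0, -2.0 and 3.0.
        ret float %r
      }
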
19397
19398.. _int_vp_reduce_fmin:
19399
19400'``llvm.vp.reduce.fmin.*``' Intrinsics
19401^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19402
19403Syntax:
19404"""""""
19405This is an overloaded intrinsic.
19406
19407::
19408
      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19410      declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19411
19412Overview:
19413"""""""""
19414
19415Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
19416value, returning the result as a scalar.
19417
19418
19419Arguments:
19420""""""""""
19421
19422The first operand is the start value of the reduction, which must be a scalar
19423floating-point type equal to the result type. The second operand is the vector
19424on which the reduction is performed and must be a vector of floating-point
19425values whose element type is the result/start type. The third operand is the
19426vector mask and is a vector of boolean values with the same number of elements
19427as the vector operand. The fourth operand is the explicit vector length of the
19428operation.
19429
19430Semantics:
19431""""""""""
19432
19433The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
19434reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
19435vector operand ``val`` on each enabled lane, taking the minimum of that and the
19436scalar ``start_value``. Disabled lanes are treated as containing the neutral
19437value (i.e. having no effect on the reduction operation). If the vector length
19438is zero, the result is the start value.
19439
19440The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set, the neutral value is ``+QNAN``. If ``nnan`` and ``ninf`` are
19442both set, then the neutral value is the largest floating-point value for the
19443result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
19444
This intrinsic has the same comparison semantics as the
19446:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
19447'``llvm.minnum.*``' intrinsic). That is, the result will always be a number
19448unless all elements of the vector and the starting value are ``NaN``. For a
vector with minimum element magnitude ``0.0`` and containing both ``+0.0`` and
19450``-0.0`` elements, the sign of the result is unspecified.
19451
19452To ignore the start value, the neutral value can be used.
19453
19454Examples:
19455"""""""""
19456
19457.. code-block:: llvm
19458
19459      %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19460      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19461      ; are treated as though %mask were false for those lanes.
19462
19463      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19464      %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
19465      %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
19466
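The sketch below illustrates the case with no fast-math flags, where the
neutral value is a quiet NaN (written ``0x7FF8000000000000`` in LLVM
assembly). The function name ``@vp_reduce_fmin_example`` and the operand
values are assumptions chosen for illustration.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmin.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @vp_reduce_fmin_example() {
        ; The start value is a quiet NaN, so it does not affect the result.
        ; Lane 3 holds the smallest element, -3.0, but its index is not less
        ; than the explicit vector length 3.
        %r = call float @llvm.vp.reduce.fmin.v4f32(float 0x7FF8000000000000, <4 x float> <float 1.0, float -2.0, float 8.0, float -3.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ; %r = -2.0, the minimum of the enabled lanes 1.0, -2.0 and 8.0.
        ret float %r
      }
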
19467
19468.. _int_get_active_lane_mask:
19469
19470'``llvm.get.active.lane.mask.*``' Intrinsics
19471^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19472
19473Syntax:
19474"""""""
19475This is an overloaded intrinsic.
19476
19477::
19478
19479      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
19480      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
19481      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
19482      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
19483
19484
19485Overview:
19486"""""""""
19487
19488Create a mask representing active and inactive vector lanes.
19489
19490
19491Arguments:
19492""""""""""
19493
19494Both operands have the same scalar integer type. The result is a vector with
19495the i1 element type.
19496
19497Semantics:
19498""""""""""
19499
19500The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
19501to:
19502
19503::
19504
19505      %m[i] = icmp ult (%base + i), %n
19506
where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, ``%base`` and ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare, and ``ult`` is
the unsigned less-than comparison operator. Overflow cannot occur in
``(%base + i)`` or in its comparison against ``%n`` because both are evaluated
on unbounded integers rather than on fixed-width machine integers. If ``%n`` is
``0``, then the result is a poison value. The above is equivalent to:
19514
19515::
19516
19517      %m = @llvm.get.active.lane.mask(%base, %n)
19518
19519This can, for example, be emitted by the loop vectorizer in which case
19520``%base`` is the first element of the vector induction variable (VIV) and
19521``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
19522less than comparison of VIV with the loop tripcount, producing a mask of
19523true/false values representing active/inactive vector lanes, except if the VIV
19524overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what type the step vector that enumerates its lanes needs to
have in order to avoid overflow.
19528
This mask ``%m`` can, for example, be used in masked load/store instructions.
These intrinsics also provide a hint to the backend: for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.
19533
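As a rough illustration only, the element-wise definition above can be modeled
in scalar C. The function name ``active_lane_mask_u64``, the fixed 64-bit
argument type and the ``bool`` array standing in for the ``i1`` vector are
assumptions of this sketch and are not part of LLVM. The sketch cannot
represent poison, so for ``n == 0`` it simply produces an all-false mask.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Scalar model of %m[i] = icmp ult (%base + i), %n for 64-bit arguments.
       The sum base + i is treated as a mathematical integer: if it would
       exceed UINT64_MAX, the lane is reported as inactive. */
    void active_lane_mask_u64(uint64_t base, uint64_t n, bool *mask,
                              size_t lanes)
    {
        for (size_t i = 0; i < lanes; ++i) {
            bool wraps = (uint64_t)i > UINT64_MAX - base;
            mask[i] = !wraps && (base + (uint64_t)i < n);
        }
    }

For instance, with ``base = 428`` and ``n = 429`` only lane 0 of a four-lane
mask is active in this model.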
19534
19535Examples:
19536"""""""""
19537
19538.. code-block:: llvm
19539
19540      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
19541      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)
19542
19543
19544.. _int_experimental_vp_splice:
19545
19546'``llvm.experimental.vp.splice``' Intrinsic
19547^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19548
19549Syntax:
19550"""""""
19551This is an overloaded intrinsic.
19552
19553::
19554
19555      declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
      declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
19557
19558Overview:
19559"""""""""
19560
19561The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
19562predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.
19563
19564Arguments:
19565""""""""""
19566
19567The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
19568the same type.  The third argument ``imm`` is an immediate signed integer that
19569indicates the offset index.  The fourth argument ``mask`` is a vector mask and
19570has the same number of elements as the result.  The last two arguments ``evl1``
19571and ``evl2`` are unsigned integers indicating the explicit vector lengths of
19572``vec1`` and ``vec2`` respectively.  ``imm``, ``evl1`` and ``evl2`` should
19573respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
19574and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
19575constraints are not satisfied the intrinsic has undefined behaviour.
19576
19577Semantics:
19578""""""""""
19579
19580Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
19581``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
19582window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
19583the concatenated vector. Elements in the result vector beyond ``evl2`` are
19584``undef``.  If ``imm`` is negative the starting index is ``evl1 + imm``.  The result
19585vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
19586negative ``imm``) elements from indices ``[imm..evl1 - 1]``
(``[evl1 + imm..evl1 - 1]`` for negative ``imm``) of ``vec1`` followed by the
19588first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
19589``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
19590elements are considered and the remaining are ``undef``.  The lanes in the result
19591vector disabled by ``mask`` are ``undef``.
19592
19593Examples:
19594"""""""""
19595
19596.. code-block:: text
19597
19598 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3)  ==> <B, E, F, undef> ; index
19599 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2) ==> <B, C, undef, undef> ; trailing elements
19600
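The two examples above omit the mask operand. They can be reproduced with a
small scalar C model, given here as a sketch only. The name ``vp_splice_ref``,
the ``char`` element type and the extra ``defined`` output array (which flags
the lanes that are not ``undef``) are assumptions made for illustration. The
model expects the constraints from the Arguments section to hold.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of vp.splice. Lanes that the IR leaves undef
       are not written; they are reported through defined[i] == false.
       Preconditions: -evl1 <= imm < evl1, evl1 <= vl and evl2 <= vl. */
    void vp_splice_ref(const char *vec1, const char *vec2, int imm,
                       const bool *mask, unsigned evl1, unsigned evl2,
                       unsigned vl, char *result, bool *defined)
    {
        /* Start of the window inside concat(vec1[0..evl1-1], vec2[0..evl2-1]). */
        unsigned start = imm >= 0 ? (unsigned)imm : evl1 - (unsigned)(-imm);

        for (unsigned i = 0; i < vl; ++i) {
            unsigned src = start + i;
            defined[i] = i < evl2 && mask[i];
            if (!defined[i])
                continue;
            result[i] = src < evl1 ? vec1[src] : vec2[src - evl1];
        }
    }

With an all-true mask, ``imm = 1``, ``evl1 = 2`` and ``evl2 = 3`` the model
fills the first three lanes with ``B``, ``E``, ``F`` and marks the last lane
as undefined.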
19601
19602.. _int_mload_mstore:
19603
19604Masked Vector Load and Store Intrinsics
19605---------------------------------------
19606
19607LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
19608
19609.. _int_mload:
19610
19611'``llvm.masked.load.*``' Intrinsics
19612^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19613
19614Syntax:
19615"""""""
19616This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
19617
19618::
19619
19620      declare <16 x float>  @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
19621      declare <2 x double>  @llvm.masked.load.v2f64.p0v2f64  (<2 x double>* <ptr>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
19622      ;; The data is a vector of pointers to double
19623      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64    (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
19624      ;; The data is a vector of function pointers
19625      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)
19626
19627Overview:
19628"""""""""
19629
19630Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
19631
19632
19633Arguments:
19634""""""""""
19635
19636The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
19637
19638Semantics:
19639""""""""""
19640
19641The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
19642The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
19643
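The per-lane behavior can also be written as a scalar C loop. This is a sketch
under assumptions: the name ``masked_load_f32`` and the use of ``bool`` and
``float`` arrays in place of IR vectors are illustrative only. Unlike a full
vector load followed by a select, the loop never reads memory for a
masked-off lane.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Scalar model of a masked load with float lanes: memory is read only
       for lanes whose mask bit is set; other lanes come from passthru. */
    void masked_load_f32(const float *ptr, const bool *mask,
                         const float *passthru, float *result, size_t lanes)
    {
        for (size_t i = 0; i < lanes; ++i)
            result[i] = mask[i] ? ptr[i] : passthru[i];
    }

A lowering for a target without masked loads could follow the same shape, with
one guarded scalar load per lane.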
19644
19645::
19646
19647       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1>%mask, <16 x float> %passthru)
19648
19649       ;; The result of the two following instructions is identical aside from potential memory access exception
19650       %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
19651       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
19652
19653.. _int_mstore:
19654
19655'``llvm.masked.store.*``' Intrinsics
19656^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19657
19658Syntax:
19659"""""""
19660This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
19661
19662::
19663
19664       declare void @llvm.masked.store.v8i32.p0v8i32  (<8  x i32>   <value>, <8  x i32>*   <ptr>, i32 <alignment>,  <8  x i1> <mask>)
19665       declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>,  <16 x i1> <mask>)
19666       ;; The data is a vector of pointers to double
19667       declare void @llvm.masked.store.v8p0f64.p0v8p0f64    (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
19668       ;; The data is a vector of function pointers
19669       declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)
19670
19671Overview:
19672"""""""""
19673
19674Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
19675
19676Arguments:
19677""""""""""
19678
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
19680
19681
19682Semantics:
19683""""""""""
19684
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
19686The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
19687
19688::
19689
19690       call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4,  <16 x i1> %mask)
19691
19692       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
19693       %oldval = load <16 x float>, <16 x float>* %ptr, align 4
19694       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
19695       store <16 x float> %res, <16 x float>* %ptr, align 4
19696
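A scalar C model of the store is sketched below. The name
``masked_store_f32`` and the array-based signature are assumptions for
illustration. In contrast to the load-modify-store sequence shown above, the
loop does not touch memory for masked-off lanes at all.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Scalar model of a masked store with float lanes: only lanes whose
       mask bit is set are written; other memory is left untouched. */
    void masked_store_f32(const float *value, float *ptr, const bool *mask,
                          size_t lanes)
    {
        for (size_t i = 0; i < lanes; ++i)
            if (mask[i])
                ptr[i] = value[i];
    }

This matches the branch-guarded scalar lowering mentioned in the Semantics
paragraph.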
19697
19698Masked Vector Gather and Scatter Intrinsics
19699-------------------------------------------
19700
19701LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
19702
19703.. _int_mgather:
19704
19705'``llvm.masked.gather.*``' Intrinsics
19706^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19707
19708Syntax:
19709"""""""
19710This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
19711
19712::
19713
19714      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32   (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
19715      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64     (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
19716      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1>  <mask>, <8 x float*> <passthru>)
19717
19718Overview:
19719"""""""""
19720
19721Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
19722
19723
19724Arguments:
19725""""""""""
19726
19727The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
19728
19729Semantics:
19730""""""""""
19731
19732The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
19734
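The sequence of conditional scalar loads can be sketched in C as follows. The
name ``masked_gather_f64`` and the array-based signature are assumptions of
this sketch. Pointers in masked-off lanes are never dereferenced, so they may
even be null in the model.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Scalar model of a masked gather with double lanes: each active lane
       loads through its own pointer; inactive lanes take passthru. */
    void masked_gather_f64(double *const *ptrs, const bool *mask,
                           const double *passthru, double *result,
                           size_t lanes)
    {
        for (size_t i = 0; i < lanes; ++i)
            result[i] = mask[i] ? *ptrs[i] : passthru[i];
    }

With an all-true mask the loop degenerates into unconditional scalar loads
followed by element insertion.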
19735
19736::
19737
19738       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)
19739
19740       ;; The gather with all-true mask is equivalent to the following instruction sequence
19741       %ptr0 = extractelement <4 x double*> %ptrs, i32 0
19742       %ptr1 = extractelement <4 x double*> %ptrs, i32 1
19743       %ptr2 = extractelement <4 x double*> %ptrs, i32 2
19744       %ptr3 = extractelement <4 x double*> %ptrs, i32 3
19745
19746       %val0 = load double, double* %ptr0, align 8
19747       %val1 = load double, double* %ptr1, align 8
19748       %val2 = load double, double* %ptr2, align 8
19749       %val3 = load double, double* %ptr3, align 8
19750
       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
19755
19756.. _int_mscatter:
19757
19758'``llvm.masked.scatter.*``' Intrinsics
19759^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19760
19761Syntax:
19762"""""""
19763This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
19764
19765::
19766
19767       declare void @llvm.masked.scatter.v8i32.v8p0i32     (<8 x i32>     <value>, <8 x i32*>     <ptrs>, i32 <alignment>, <8 x i1>  <mask>)
19768       declare void @llvm.masked.scatter.v16f32.v16p1f32   (<16 x float>  <value>, <16 x float addrspace(1)*>  <ptrs>, i32 <alignment>, <16 x i1> <mask>)
19769       declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1>  <mask>)
19770
19771Overview:
19772"""""""""
19773
19774Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
19775
19776Arguments:
19777""""""""""
19778
19779The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
19780
19781Semantics:
19782""""""""""
19783
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
19785
19786::
19787
19788       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4,  <8 x i1>  <true, true, .. true>)
19790
19791       ;; It is equivalent to a list of scalar stores
19792       %val0 = extractelement <8 x i32> %value, i32 0
19793       %val1 = extractelement <8 x i32> %value, i32 1
19794       ..
19795       %val7 = extractelement <8 x i32> %value, i32 7
19796       %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
19797       %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
19798       ..
19799       %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
19800       ;; Note: the order of the following stores is important when they overlap:
19801       store i32 %val0, i32* %ptr0, align 4
19802       store i32 %val1, i32* %ptr1, align 4
19803       ..
19804       store i32 %val7, i32* %ptr7, align 4
19805
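The ordering guarantee for overlapping addresses can be illustrated with a
scalar C model. The name ``masked_scatter_i32`` and the array-based signature
are assumptions of this sketch. Because lanes are processed from lowest to
highest, the highest active lane that targets a given address determines the
final value stored there.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Scalar model of a masked scatter with i32 lanes: active lanes are
       stored in increasing lane order, so later lanes overwrite earlier
       ones when their addresses overlap. */
    void masked_scatter_i32(const int32_t *value, int32_t *const *ptrs,
                            const bool *mask, size_t lanes)
    {
        for (size_t i = 0; i < lanes; ++i)
            if (mask[i])
                *ptrs[i] = value[i];
    }

If lanes 0 and 1 are both active and point to the same location, that location
ends up holding the value from lane 1.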
19806
19807Masked Vector Expanding Load and Compressing Store Intrinsics
19808-------------------------------------------------------------
19809
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
19811
19812.. _int_expandload:
19813
19814'``llvm.masked.expandload.*``' Intrinsics
19815^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19816
19817Syntax:
19818"""""""
19819This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
19820
19821::
19822
19823      declare <16 x float>  @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
19824      declare <2 x i64>     @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1>  <mask>, <2 x i64> <passthru>)
19825
19826Overview:
19827"""""""""
19828
19829Reads a number of scalar values sequentially from memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
19830
19831
19832Arguments:
19833""""""""""
19834
19835The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
19836
19837Semantics:
19838""""""""""
19839
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:
19841
19842.. code-block:: c
19843
19844    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
19846    for (int i = 0; i < size; ++i) {
19847      if (C[i] != 0)
19848        A[i] = B[j++];
19849    }
19850
19851
19852.. code-block:: llvm
19853
19854    ; Load several elements from array B and expand them in a vector.
19855    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
19856    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
19857    ; Store the result in A
19858    call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)
19859
19860    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
19861    %MaskI = bitcast <8 x i1> %Mask to i8
19862    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
19863    %MaskI64 = zext i8 %MaskIPopcnt to i64
19864    %BNextInd = add i64 %BInd, %MaskI64
19865
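A scalar C model may make the lane placement easier to follow. It is a sketch
only: the name ``expandload_f64``, the array-based signature and the returned
element count are assumptions for illustration. The count equals the number of
set mask bits, which is the amount by which the source pointer advances per
iteration in the loop above.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Scalar model of an expanding load with double lanes: consecutive
       memory elements are placed into the active lanes in lane order;
       inactive lanes take passthru. Returns the number of elements read. */
    size_t expandload_f64(const double *ptr, const bool *mask,
                          const double *passthru, double *result,
                          size_t lanes)
    {
        size_t next = 0;
        for (size_t i = 0; i < lanes; ++i)
            result[i] = mask[i] ? ptr[next++] : passthru[i];
        return next;
    }

With lanes 0, 3 and 7 active, the model reads three consecutive elements and
places them in exactly those lanes.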
19866
19867Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
19868If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
19869
19870.. _int_compressstore:
19871
19872'``llvm.masked.compressstore.*``' Intrinsics
19873^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19874
19875Syntax:
19876"""""""
19877This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
19878
19879::
19880
19881      declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, i32*   <ptr>, <8  x i1> <mask>)
19882      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)
19883
19884Overview:
19885"""""""""
19886
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
19888
19889Arguments:
19890""""""""""
19891
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
19893
19894
19895Semantics:
19896""""""""""
19897
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences, as in the following example:
19899
19900.. code-block:: c
19901
19902    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
19907    }
19908
19909
19910.. code-block:: llvm
19911
19912    ; Load elements from A.
19913    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
19914    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)
19916
19917    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
19918    %MaskI = bitcast <8 x i1> %Mask to i8
19919    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
19920    %MaskI64 = zext i8 %MaskIPopcnt to i64
19921    %BNextInd = add i64 %BInd, %MaskI64
19922
19923
19924Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
19925
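The branch-guarded scalar form can be sketched in C. The name
``compressstore_f64``, the array-based signature and the returned element
count are assumptions of this sketch, not part of LLVM. The count equals the
number of set mask bits, that is, the number of elements written.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Scalar model of a compressing store with double lanes: elements from
       active lanes are written to consecutive addresses, from lower to
       higher lanes. Returns the number of elements stored. */
    size_t compressstore_f64(const double *value, double *ptr,
                             const bool *mask, size_t lanes)
    {
        size_t next = 0;
        for (size_t i = 0; i < lanes; ++i)
            if (mask[i])
                ptr[next++] = value[i];
        return next;
    }

With lanes 0, 3 and 7 active, the three selected elements land in the first
three destination slots and nothing else is written.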
19926
19927Memory Use Markers
19928------------------
19929
19930This class of intrinsics provides information about the
19931:ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
19932are immutable.
19933
19934.. _int_lifestart:
19935
19936'``llvm.lifetime.start``' Intrinsic
19937^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19938
19939Syntax:
19940"""""""
19941
19942::
19943
19944      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
19945
19946Overview:
19947"""""""""
19948
19949The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
19950object's lifetime.
19951
19952Arguments:
19953""""""""""
19954
19955The first argument is a constant integer representing the size of the
19956object, or -1 if it is variable sized. The second argument is a pointer
19957to the object.
19958
19959Semantics:
19960""""""""""
19961
19962If ``ptr`` is a stack-allocated object and it points to the first byte of
19963the object, the object is initially marked as dead.
19964``ptr`` is conservatively considered as a non-stack-allocated object if
19965the stack coloring algorithm that is used in the optimization pipeline cannot
19966conclude that ``ptr`` is a stack-allocated object.
19967
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is
marked as alive and has an uninitialized value.
19970The stack object is marked as dead when either
19971:ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
19972function returns.
19973
19974After :ref:`llvm.lifetime.end <int_lifeend>` is called,
19975'``llvm.lifetime.start``' on the stack object can be called again.
19976The second '``llvm.lifetime.start``' call marks the object as alive, but it
19977does not change the address of the object.
19978
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.
19982
19983
19984.. _int_lifeend:
19985
19986'``llvm.lifetime.end``' Intrinsic
19987^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19988
19989Syntax:
19990"""""""
19991
19992::
19993
19994      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
19995
19996Overview:
19997"""""""""
19998
19999The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
20000lifetime.
20001
20002Arguments:
20003""""""""""
20004
20005The first argument is a constant integer representing the size of the
20006object, or -1 if it is variable sized. The second argument is a pointer
20007to the object.
20008
20009Semantics:
20010""""""""""
20011
20012If ``ptr`` is a stack-allocated object and it points to the first byte of the
20013object, the object is dead.
20014``ptr`` is conservatively considered as a non-stack-allocated object if
20015the stack coloring algorithm that is used in the optimization pipeline cannot
20016conclude that ``ptr`` is a stack-allocated object.
20017
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
20019
20020If ``ptr`` is a non-stack-allocated object or it does not point to the first
20021byte of the object, it is equivalent to simply filling all bytes of the object
20022with ``poison``.
20023
20024
20025'``llvm.invariant.start``' Intrinsic
20026^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20027
20028Syntax:
20029"""""""
20030This is an overloaded intrinsic. The memory object can belong to any address space.
20031
20032::
20033
20034      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)
20035
20036Overview:
20037"""""""""
20038
20039The '``llvm.invariant.start``' intrinsic specifies that the contents of
20040a memory object will not change.
20041
20042Arguments:
20043""""""""""
20044
20045The first argument is a constant integer representing the size of the
20046object, or -1 if it is variable sized. The second argument is a pointer
20047to the object.
20048
20049Semantics:
20050""""""""""
20051
20052This intrinsic indicates that until an ``llvm.invariant.end`` that uses
20053the return value, the referenced memory location is constant and
20054unchanging.
20055
20056'``llvm.invariant.end``' Intrinsic
20057^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20058
20059Syntax:
20060"""""""
20061This is an overloaded intrinsic. The memory object can belong to any address space.
20062
20063::
20064
20065      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)
20066
20067Overview:
20068"""""""""
20069
20070The '``llvm.invariant.end``' intrinsic specifies that the contents of a
20071memory object are mutable.
20072
20073Arguments:
20074""""""""""
20075
20076The first argument is the matching ``llvm.invariant.start`` intrinsic.
20077The second argument is a constant integer representing the size of the
20078object, or -1 if it is variable sized and the third argument is a
20079pointer to the object.
20080
20081Semantics:
20082""""""""""
20083
20084This intrinsic indicates that the memory is mutable again.
20085
20086'``llvm.launder.invariant.group``' Intrinsic
20087^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20088
20089Syntax:
20090"""""""
20091This is an overloaded intrinsic. The memory object can belong to any address
20092space. The returned pointer must belong to the same address space as the
20093argument.
20094
20095::
20096
20097      declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)
20098
20099Overview:
20100"""""""""
20101
20102The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
20103established by ``invariant.group`` metadata no longer holds, to obtain a new
20104pointer value that carries fresh invariant group information. It is an
20105experimental intrinsic, which means that its semantics might change in the
20106future.
20107
20108
20109Arguments:
20110""""""""""
20111
20112The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
20113to the memory.
20114
20115Semantics:
20116""""""""""
20117
20118Returns another pointer that aliases its argument but which is considered different
20119for the purposes of ``load``/``store`` ``invariant.group`` metadata.
20120It does not read any accessible memory and the execution can be speculated.
20121
20122'``llvm.strip.invariant.group``' Intrinsic
20123^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20124
20125Syntax:
20126"""""""
20127This is an overloaded intrinsic. The memory object can belong to any address
20128space. The returned pointer must belong to the same address space as the
20129argument.
20130
20131::
20132
20133      declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)
20134
20135Overview:
20136"""""""""
20137
20138The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
20139established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
20140value that does not carry the invariant information. It is an experimental
20141intrinsic, which means that its semantics might change in the future.
20142
20143
20144Arguments:
20145""""""""""
20146
20147The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
20148to the memory.
20149
20150Semantics:
20151""""""""""
20152
20153Returns another pointer that aliases its argument but which has no associated
20154``invariant.group`` metadata.
20155It does not read any memory and can be speculated.
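
One plausible use, sketched here under the assumption that a frontend wants to
compare two addresses without the comparison revealing anything about invariant
groups, is to strip both pointers first. The function name ``@same_address`` is
hypothetical:

.. code-block:: llvm

   declare i8* @llvm.strip.invariant.group.p0i8(i8*)

   define i1 @same_address(i8* %p, i8* %q) {
   entry:
     ; Both results alias their arguments but carry no invariant.group
     ; information.
     %p.raw = call i8* @llvm.strip.invariant.group.p0i8(i8* %p)
     %q.raw = call i8* @llvm.strip.invariant.group.p0i8(i8* %q)
     %eq = icmp eq i8* %p.raw, %q.raw
     ret i1 %eq
   }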
20156
20157
20158
20159.. _constrainedfp:
20160
20161Constrained Floating-Point Intrinsics
20162-------------------------------------
20163
20164These intrinsics are used to provide special handling of floating-point
20165operations when specific rounding mode or floating-point exception behavior is
20166required.  By default, LLVM optimization passes assume that the rounding mode is
20167round-to-nearest and that floating-point exceptions will not be monitored.
20168Constrained FP intrinsics are used to support non-default rounding modes and
20169accurately preserve exception behavior without compromising LLVM's ability to
20170optimize FP code when the default behavior is used.
20171
20172If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can miscompile a function that mixes constrained and normal
operations. The correct way to mix constrained and less constrained
20176operations is to use the rounding mode and exception handling metadata to
20177mark constrained intrinsics as having LLVM's default behavior.
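
As an illustrative sketch (the function ``@mixed`` and its operands are
hypothetical), the function below needs strict semantics only for the addition.
The multiplication is still written as a constrained intrinsic, but its
metadata arguments request LLVM's default behavior, using the
``round.tonearest`` and ``fpexcept.ignore`` strings described later in this
section:

.. code-block:: llvm

   declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)
   declare float @llvm.experimental.constrained.fmul.f32(float, float, metadata, metadata)

   define float @mixed(float %a, float %b, float %c) #0 {
   entry:
     ; Strictly constrained: unknown rounding mode, exceptions preserved.
     %sum = call float @llvm.experimental.constrained.fadd.f32(
                float %a, float %b,
                metadata !"round.dynamic",
                metadata !"fpexcept.strict") #0
     ; Default behavior, expressed through the constrained form.
     %res = call float @llvm.experimental.constrained.fmul.f32(
                float %sum, float %c,
                metadata !"round.tonearest",
                metadata !"fpexcept.ignore") #0
     ret float %res
   }

   attributes #0 = { strictfp }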
20178
20179Each of these intrinsics corresponds to a normal floating-point operation. The
20180data arguments and the return value are the same as the corresponding FP
20181operation.
20182
20183The rounding mode argument is a metadata string specifying what
20184assumptions, if any, the optimizer can make when transforming constant
20185values. Some constrained FP intrinsics omit this argument. If required
20186by the intrinsic, this argument must be one of the following strings:
20187
20188::
20189
20190      "round.dynamic"
20191      "round.tonearest"
20192      "round.downward"
20193      "round.upward"
20194      "round.towardzero"
20195      "round.tonearestaway"
20196
If this argument is "round.dynamic", optimization passes must assume that the
20198rounding mode is unknown and may change at runtime.  No transformations that
20199depend on rounding mode may be performed in this case.
20200
The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes.  If the argument is any of these values,
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.
20205
20206For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
20207"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
20208'x-0' should evaluate to '-0' when rounding downward.  However, this
20209transformation is legal for all other rounding modes.
20210
For values other than "round.dynamic", optimization passes may assume that the
20212actual runtime rounding mode (as defined in a target-specific manner) matches
20213the specified rounding mode, but this is not guaranteed.  Using a specific
20214non-dynamic rounding mode which does not match the actual rounding mode at
20215runtime results in undefined behavior.
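
The 'x-0' case can be written out as follows. This is only a sketch: the two
function names are hypothetical, and ``fpexcept.ignore`` is chosen so that the
rounding mode is the only constraint in play.

.. code-block:: llvm

   declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)

   ; Must not be folded to %x: for %x = +0.0 the result is -0.0 when
   ; rounding downward.
   define double @sub_zero_downward(double %x) #0 {
     %r = call double @llvm.experimental.constrained.fsub.f64(
              double %x, double 0.0,
              metadata !"round.downward",
              metadata !"fpexcept.ignore") #0
     ret double %r
   }

   ; May be folded to %x. If the runtime rounding mode is not actually
   ; round-to-nearest when this executes, the behavior is undefined.
   define double @sub_zero_nearest(double %x) #0 {
     %r = call double @llvm.experimental.constrained.fsub.f64(
              double %x, double 0.0,
              metadata !"round.tonearest",
              metadata !"fpexcept.ignore") #0
     ret double %r
   }

   attributes #0 = { strictfp }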
20216
The exception behavior argument is a metadata string describing the
floating-point exception semantics required for the intrinsic. This argument
must be one of the following strings:
20220
20221::
20222
20223      "fpexcept.ignore"
20224      "fpexcept.maytrap"
20225      "fpexcept.strict"
20226
If this argument is "fpexcept.ignore", optimization passes may assume that the
20228exception status flags will not be read and that floating-point exceptions will
20229be masked.  This allows transformations to be performed that may change the
20230exception semantics of the original code.  For example, FP operations may be
20231speculatively executed in this case whereas they must not be for either of the
20232other possible values of this argument.
20233
If the exception behavior argument is "fpexcept.maytrap", optimization passes
20235must avoid transformations that may raise exceptions that would not have been
20236raised by the original code (such as speculatively executing FP operations), but
20237passes are not required to preserve all exceptions that are implied by the
20238original code.  For example, exceptions may be potentially hidden by constant
20239folding.
20240
If the exception behavior argument is "fpexcept.strict", all transformations must
20242strictly preserve the floating-point exception semantics of the original code.
20243Any FP exception that would have been raised by the original code must be raised
20244by the transformed code, and the transformed code must not raise any FP
20245exceptions that would not have been raised by the original code.  This is the
20246exception behavior argument that will be used if the code being compiled reads
20247the FP exception status flags, but this mode can also be used with code that
20248unmasks FP exceptions.
20249
The number and order of floating-point exceptions are NOT guaranteed.  For
20251example, a series of FP operations that each may raise exceptions may be
20252vectorized into a single instruction that raises each unique exception a single
20253time.
20254
20255Proper :ref:`function attributes <fnattrs>` usage is required for the
20256constrained intrinsics to function correctly.
20257
20258All function *calls* done in a function that uses constrained floating
20259point intrinsics must have the ``strictfp`` attribute.
20260
20261All function *definitions* that use constrained floating point intrinsics
20262must have the ``strictfp`` attribute.
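
A small sketch of both requirements, with the attribute written inline rather
than through an attribute group. The names ``@helper`` and ``@add_then_call``
are hypothetical:

.. code-block:: llvm

   declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
   declare double @helper(double) ; hypothetical ordinary function

   ; The definition uses a constrained intrinsic, so it is marked strictfp.
   define double @add_then_call(double %a, double %b) strictfp {
   entry:
     %sum = call double @llvm.experimental.constrained.fadd.f64(
                double %a, double %b,
                metadata !"round.dynamic",
                metadata !"fpexcept.strict") strictfp
     ; Every call in this function carries strictfp, including calls to
     ; functions that are not intrinsics.
     %r = call double @helper(double %sum) strictfp
     ret double %r
   }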
20263
20264'``llvm.experimental.constrained.fadd``' Intrinsic
20265^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20266
20267Syntax:
20268"""""""
20269
20270::
20271
20272      declare <type>
20273      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
20274                                          metadata <rounding mode>,
20275                                          metadata <exception behavior>)
20276
20277Overview:
20278"""""""""
20279
20280The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
20281two operands.
20282
20283
20284Arguments:
20285""""""""""
20286
20287The first two arguments to the '``llvm.experimental.constrained.fadd``'
20288intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20289of floating-point values. Both arguments must have identical types.
20290
20291The third and fourth arguments specify the rounding mode and exception
20292behavior as described above.
20293
20294Semantics:
20295""""""""""
20296
20297The value produced is the floating-point sum of the two value operands and has
20298the same type as the operands.
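
For illustration, a vector instance might look like the sketch below. The
function name ``@vadd`` and the chosen metadata values are arbitrary. The same
call shape applies to the ``fsub``, ``fmul``, ``fdiv`` and ``frem`` intrinsics
described next.

.. code-block:: llvm

   declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(<4 x float>, <4 x float>, metadata, metadata)

   define <4 x float> @vadd(<4 x float> %a, <4 x float> %b) #0 {
     ; Element-wise sum, assumed to be rounded upward; transformations that
     ; could introduce new exceptions are not allowed.
     %sum = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                <4 x float> %a, <4 x float> %b,
                metadata !"round.upward",
                metadata !"fpexcept.maytrap") #0
     ret <4 x float> %sum
   }

   attributes #0 = { strictfp }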
20299
20300
20301'``llvm.experimental.constrained.fsub``' Intrinsic
20302^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20303
20304Syntax:
20305"""""""
20306
20307::
20308
20309      declare <type>
20310      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
20311                                          metadata <rounding mode>,
20312                                          metadata <exception behavior>)
20313
20314Overview:
20315"""""""""
20316
20317The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
20318of its two operands.
20319
20320
20321Arguments:
20322""""""""""
20323
20324The first two arguments to the '``llvm.experimental.constrained.fsub``'
20325intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20326of floating-point values. Both arguments must have identical types.
20327
20328The third and fourth arguments specify the rounding mode and exception
20329behavior as described above.
20330
20331Semantics:
20332""""""""""
20333
20334The value produced is the floating-point difference of the two value operands
20335and has the same type as the operands.
20336
20337
20338'``llvm.experimental.constrained.fmul``' Intrinsic
20339^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20340
20341Syntax:
20342"""""""
20343
20344::
20345
20346      declare <type>
20347      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
20348                                          metadata <rounding mode>,
20349                                          metadata <exception behavior>)
20350
20351Overview:
20352"""""""""
20353
20354The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
20355its two operands.
20356
20357
20358Arguments:
20359""""""""""
20360
20361The first two arguments to the '``llvm.experimental.constrained.fmul``'
20362intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20363of floating-point values. Both arguments must have identical types.
20364
20365The third and fourth arguments specify the rounding mode and exception
20366behavior as described above.
20367
20368Semantics:
20369""""""""""
20370
20371The value produced is the floating-point product of the two value operands and
20372has the same type as the operands.
20373
20374
20375'``llvm.experimental.constrained.fdiv``' Intrinsic
20376^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20377
20378Syntax:
20379"""""""
20380
20381::
20382
20383      declare <type>
20384      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
20385                                          metadata <rounding mode>,
20386                                          metadata <exception behavior>)
20387
20388Overview:
20389"""""""""
20390
20391The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
20392its two operands.
20393
20394
20395Arguments:
20396""""""""""
20397
20398The first two arguments to the '``llvm.experimental.constrained.fdiv``'
20399intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20400of floating-point values. Both arguments must have identical types.
20401
20402The third and fourth arguments specify the rounding mode and exception
20403behavior as described above.
20404
20405Semantics:
20406""""""""""
20407
20408The value produced is the floating-point quotient of the two value operands and
20409has the same type as the operands.
20410
20411
20412'``llvm.experimental.constrained.frem``' Intrinsic
20413^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20414
20415Syntax:
20416"""""""
20417
20418::
20419
20420      declare <type>
20421      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
20422                                          metadata <rounding mode>,
20423                                          metadata <exception behavior>)
20424
20425Overview:
20426"""""""""
20427
20428The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
20429from the division of its two operands.
20430
20431
20432Arguments:
20433""""""""""
20434
20435The first two arguments to the '``llvm.experimental.constrained.frem``'
20436intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20437of floating-point values. Both arguments must have identical types.
20438
20439The third and fourth arguments specify the rounding mode and exception
20440behavior as described above.  The rounding mode argument has no effect, since
20441the result of frem is never rounded, but the argument is included for
20442consistency with the other constrained floating-point intrinsics.
20443
20444Semantics:
20445""""""""""
20446
20447The value produced is the floating-point remainder from the division of the two
20448value operands and has the same type as the operands.  The remainder has the
20449same sign as the dividend.
20450
20451'``llvm.experimental.constrained.fma``' Intrinsic
20452^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20453
20454Syntax:
20455"""""""
20456
20457::
20458
20459      declare <type>
20460      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)
20463
20464Overview:
20465"""""""""
20466
20467The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
20468fused-multiply-add operation on its operands.
20469
20470Arguments:
20471""""""""""
20472
20473The first three arguments to the '``llvm.experimental.constrained.fma``'
20474intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
20475<t_vector>` of floating-point values. All arguments must have identical types.
20476
20477The fourth and fifth arguments specify the rounding mode and exception behavior
20478as described above.
20479
20480Semantics:
20481""""""""""
20482
20483The result produced is the product of the first two operands added to the third
20484operand computed with infinite precision, and then rounded to the target
20485precision.
20486
20487'``llvm.experimental.constrained.fptoui``' Intrinsic
20488^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20489
20490Syntax:
20491"""""""
20492
20493::
20494
20495      declare <ty2>
20496      @llvm.experimental.constrained.fptoui(<type> <value>,
                                            metadata <exception behavior>)
20498
20499Overview:
20500"""""""""
20501
20502The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
20503floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
20504
20505Arguments:
20506""""""""""
20507
20508The first argument to the '``llvm.experimental.constrained.fptoui``'
20509intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20510<t_vector>` of floating point values.
20511
20512The second argument specifies the exception behavior as described above.
20513
20514Semantics:
20515""""""""""
20516
20517The result produced is an unsigned integer converted from the floating
20518point operand. The value is truncated, so it is rounded towards zero.
20519
20520'``llvm.experimental.constrained.fptosi``' Intrinsic
20521^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20522
20523Syntax:
20524"""""""
20525
20526::
20527
20528      declare <ty2>
20529      @llvm.experimental.constrained.fptosi(<type> <value>,
                                            metadata <exception behavior>)
20531
20532Overview:
20533"""""""""
20534
The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
floating-point ``value`` to its signed integer equivalent of type ``ty2``.
20537
20538Arguments:
20539""""""""""
20540
20541The first argument to the '``llvm.experimental.constrained.fptosi``'
20542intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20543<t_vector>` of floating point values.
20544
20545The second argument specifies the exception behavior as described above.
20546
20547Semantics:
20548""""""""""
20549
20550The result produced is a signed integer converted from the floating
20551point operand. The value is truncated, so it is rounded towards zero.
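
A hedged sketch of a scalar conversion follows. The function name ``@to_int``
is hypothetical, and the intrinsic name suffix is written with the result type
first and the operand type second, following the usual naming of overloaded
intrinsics:

.. code-block:: llvm

   declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)

   define i32 @to_int(double %x) #0 {
     ; No rounding mode argument: the conversion always truncates.
     %i = call i32 @llvm.experimental.constrained.fptosi.i32.f64(
              double %x,
              metadata !"fpexcept.strict") #0
     ret i32 %i
   }

   attributes #0 = { strictfp }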
20552
20553'``llvm.experimental.constrained.uitofp``' Intrinsic
20554^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20555
20556Syntax:
20557"""""""
20558
20559::
20560
20561      declare <ty2>
20562      @llvm.experimental.constrained.uitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
20565
20566Overview:
20567"""""""""
20568
20569The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point value of type ``ty2``.
20571
20572Arguments:
20573""""""""""
20574
20575The first argument to the '``llvm.experimental.constrained.uitofp``'
20576intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20577<t_vector>` of integer values.
20578
20579The second and third arguments specify the rounding mode and exception
20580behavior as described above.
20581
20582Semantics:
20583""""""""""
20584
20585An inexact floating-point exception will be raised if rounding is required.
20586Any result produced is a floating point value converted from the input
20587integer operand.
20588
20589'``llvm.experimental.constrained.sitofp``' Intrinsic
20590^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20591
20592Syntax:
20593"""""""
20594
20595::
20596
20597      declare <ty2>
20598      @llvm.experimental.constrained.sitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
20601
20602Overview:
20603"""""""""
20604
20605The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.
20607
20608Arguments:
20609""""""""""
20610
20611The first argument to the '``llvm.experimental.constrained.sitofp``'
20612intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20613<t_vector>` of integer values.
20614
20615The second and third arguments specify the rounding mode and exception
20616behavior as described above.
20617
20618Semantics:
20619""""""""""
20620
20621An inexact floating-point exception will be raised if rounding is required.
20622Any result produced is a floating point value converted from the input
20623integer operand.
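
As a sketch (``@to_float`` is a hypothetical name), converting an ``i64`` to
``float`` is a case where rounding can be required, because not every 64-bit
integer is exactly representable as a ``float``:

.. code-block:: llvm

   declare float @llvm.experimental.constrained.sitofp.f32.i64(i64, metadata, metadata)

   define float @to_float(i64 %n) #0 {
     ; Inputs that are not exactly representable are rounded according to
     ; the stated mode, and the inexact exception is raised.
     %f = call float @llvm.experimental.constrained.sitofp.f32.i64(
              i64 %n,
              metadata !"round.towardzero",
              metadata !"fpexcept.strict") #0
     ret float %f
   }

   attributes #0 = { strictfp }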
20624
20625'``llvm.experimental.constrained.fptrunc``' Intrinsic
20626^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20627
20628Syntax:
20629"""""""
20630
20631::
20632
20633      declare <ty2>
20634      @llvm.experimental.constrained.fptrunc(<type> <value>,
                                             metadata <rounding mode>,
                                             metadata <exception behavior>)
20637
20638Overview:
20639"""""""""
20640
20641The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
20642to type ``ty2``.
20643
20644Arguments:
20645""""""""""
20646
20647The first argument to the '``llvm.experimental.constrained.fptrunc``'
20648intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20649<t_vector>` of floating point values. This argument must be larger in size
20650than the result.
20651
20652The second and third arguments specify the rounding mode and exception
20653behavior as described above.
20654
20655Semantics:
20656""""""""""
20657
20658The result produced is a floating point value truncated to be smaller in size
20659than the operand.
20660
20661'``llvm.experimental.constrained.fpext``' Intrinsic
20662^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20663
20664Syntax:
20665"""""""
20666
20667::
20668
20669      declare <ty2>
20670      @llvm.experimental.constrained.fpext(<type> <value>,
                                           metadata <exception behavior>)
20672
20673Overview:
20674"""""""""
20675
20676The '``llvm.experimental.constrained.fpext``' intrinsic extends a
20677floating-point ``value`` to a larger floating-point value.
20678
20679Arguments:
20680""""""""""
20681
20682The first argument to the '``llvm.experimental.constrained.fpext``'
20683intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20684<t_vector>` of floating point values. This argument must be smaller in size
20685than the result.
20686
20687The second argument specifies the exception behavior as described above.
20688
20689Semantics:
20690""""""""""
20691
20692The result produced is a floating point value extended to be larger in size
20693than the operand. All restrictions that apply to the fpext instruction also
20694apply to this intrinsic.
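
The sketch below combines this intrinsic with
'``llvm.experimental.constrained.fptrunc``': a ``float`` is extended, squared
in ``double``, and truncated back. The function name ``@square_in_double`` is
hypothetical. Note that only the truncation and the multiplication take a
rounding mode argument.

.. code-block:: llvm

   declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)
   declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
   declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)

   define float @square_in_double(float %x) #0 {
     %wide = call double @llvm.experimental.constrained.fpext.f64.f32(
                 float %x,
                 metadata !"fpexcept.strict") #0
     %sq = call double @llvm.experimental.constrained.fmul.f64(
               double %wide, double %wide,
               metadata !"round.dynamic",
               metadata !"fpexcept.strict") #0
     %narrow = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                   double %sq,
                   metadata !"round.dynamic",
                   metadata !"fpexcept.strict") #0
     ret float %narrow
   }

   attributes #0 = { strictfp }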
20695
20696'``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
20697^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20698
20699Syntax:
20700"""""""
20701
20702::
20703
20704      declare <ty2>
20705      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
20706                                          metadata <condition code>,
20707                                          metadata <exception behavior>)
20708      declare <ty2>
20709      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
20710                                           metadata <condition code>,
20711                                           metadata <exception behavior>)
20712
20713Overview:
20714"""""""""
20715
20716The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on a comparison of their operands.
20719
20720If the operands are floating-point scalars, then the result type is a
20721boolean (:ref:`i1 <t_integer>`).
20722
20723If the operands are floating-point vectors, then the result type is a
vector of boolean values with the same number of elements as the operands being
20725compared.
20726
20727The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
20728comparison operation while the '``llvm.experimental.constrained.fcmps``'
20729intrinsic performs a signaling comparison operation.
20730
20731Arguments:
20732""""""""""
20733
20734The first two arguments to the '``llvm.experimental.constrained.fcmp``'
20735and '``llvm.experimental.constrained.fcmps``' intrinsics must be
20736:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20737of floating-point values. Both arguments must have identical types.
20738
20739The third argument is the condition code indicating the kind of comparison
20740to perform. It must be a metadata string with one of the following values:
20741
20742- "``oeq``": ordered and equal
20743- "``ogt``": ordered and greater than
20744- "``oge``": ordered and greater than or equal
20745- "``olt``": ordered and less than
20746- "``ole``": ordered and less than or equal
20747- "``one``": ordered and not equal
20748- "``ord``": ordered (no nans)
20749- "``ueq``": unordered or equal
20750- "``ugt``": unordered or greater than
20751- "``uge``": unordered or greater than or equal
20752- "``ult``": unordered or less than
20753- "``ule``": unordered or less than or equal
20754- "``une``": unordered or not equal
20755- "``uno``": unordered (either nans)
20756
20757*Ordered* means that neither operand is a NAN while *unordered* means
20758that either operand may be a NAN.
20759
20760The fourth argument specifies the exception behavior as described above.
20761
20762Semantics:
20763""""""""""
20764
20765``op1`` and ``op2`` are compared according to the condition code given
20766as the third argument. If the operands are vectors, then the
20767vectors are compared element by element. Each comparison performed
20768always yields an :ref:`i1 <t_integer>` result, as follows:
20769
20770- "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
20771  is equal to ``op2``.
20772- "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
20773  is greater than ``op2``.
20774- "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
20775  is greater than or equal to ``op2``.
20776- "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
20777  is less than ``op2``.
20778- "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
20779  is less than or equal to ``op2``.
20780- "``one``": yields ``true`` if both operands are not a NAN and ``op1``
20781  is not equal to ``op2``.
20782- "``ord``": yields ``true`` if both operands are not a NAN.
20783- "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
20784  equal to ``op2``.
20785- "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
20786  greater than ``op2``.
20787- "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
20788  greater than or equal to ``op2``.
20789- "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
20790  less than ``op2``.
20791- "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
20792  less than or equal to ``op2``.
20793- "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
20794  not equal to ``op2``.
20795- "``uno``": yields ``true`` if either operand is a NAN.
20796
20797The quiet comparison operation performed by
20798'``llvm.experimental.constrained.fcmp``' will only raise an exception
20799if either operand is a SNAN.  The signaling comparison operation
20800performed by '``llvm.experimental.constrained.fcmps``' will raise an
exception if either operand is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g. the exception might only
set a flag), therefore the distinction between ordered and unordered
20804comparisons is also relevant for the
20805'``llvm.experimental.constrained.fcmps``' intrinsic.
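
To make the quiet/signaling distinction concrete, here is a sketch with two
hypothetical functions, ``@is_unordered`` and ``@less_than``. The first can
test for a NAN without raising an exception on a QNAN. The second raises an
exception for any NAN operand and still yields ``false`` in that case, because
``olt`` is an ordered comparison.

.. code-block:: llvm

   declare i1 @llvm.experimental.constrained.fcmp.f64(double, double, metadata, metadata)
   declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)

   define i1 @is_unordered(double %a, double %b) #0 {
     ; Quiet comparison: raises an exception only for an SNAN operand.
     %r = call i1 @llvm.experimental.constrained.fcmp.f64(
              double %a, double %b,
              metadata !"uno",
              metadata !"fpexcept.strict") #0
     ret i1 %r
   }

   define i1 @less_than(double %a, double %b) #0 {
     ; Signaling comparison: raises an exception for any NAN operand.
     %r = call i1 @llvm.experimental.constrained.fcmps.f64(
              double %a, double %b,
              metadata !"olt",
              metadata !"fpexcept.strict") #0
     ret i1 %r
   }

   attributes #0 = { strictfp }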
20806
20807'``llvm.experimental.constrained.fmuladd``' Intrinsic
20808^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20809
20810Syntax:
20811"""""""
20812
20813::
20814
20815      declare <type>
20816      @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
20817                                             <type> <op3>,
20818                                             metadata <rounding mode>,
20819                                             metadata <exception behavior>)
20820
20821Overview:
20822"""""""""
20823
20824The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
20825multiply-add expressions that can be fused if the code generator determines
20826that (a) the target instruction set has support for a fused operation,
20827and (b) that the fused operation is more efficient than the equivalent,
20828separate pair of mul and add instructions.
20829
20830Arguments:
20831""""""""""
20832
20833The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
20834intrinsic must be floating-point or vector of floating-point values.
20835All three arguments must have identical types.
20836
20837The fourth and fifth arguments specify the rounding mode and exception behavior
20838as described above.
20839
20840Semantics:
20841""""""""""
20842
20843The expression:
20844
20845::
20846
20847      %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
20848                                                                 metadata <rounding mode>,
20849                                                                 metadata <exception behavior>)
20850
20851is equivalent to the expression:
20852
20853::
20854
20855      %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
20856                                                              metadata <rounding mode>,
20857                                                              metadata <exception behavior>)
20858      %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
20859                                                              metadata <rounding mode>,
20860                                                              metadata <exception behavior>)
20861
20862except that it is unspecified whether rounding will be performed between the
20863multiplication and addition steps. Fusion is not guaranteed, even if the target
20864platform supports it.
20865If a fused multiply-add is required, the corresponding
20866:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
20867used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets ``errno``.
20869
20870Constrained libm-equivalent Intrinsics
20871--------------------------------------
20872
20873In addition to the basic floating-point operations for which constrained
20874intrinsics are described above, there are constrained versions of various
20875operations which provide equivalent behavior to a corresponding libm function.
20876These intrinsics allow the precise behavior of these operations with respect to
20877rounding mode and exception behavior to be controlled.
20878
20879As with the basic constrained floating-point intrinsics, the rounding mode
20880and exception behavior arguments only control the behavior of the optimizer.
20881They do not change the runtime floating-point environment.
20882
20883
20884'``llvm.experimental.constrained.sqrt``' Intrinsic
20885^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20886
20887Syntax:
20888"""""""
20889
20890::
20891
20892      declare <type>
20893      @llvm.experimental.constrained.sqrt(<type> <op1>,
20894                                          metadata <rounding mode>,
20895                                          metadata <exception behavior>)
20896
20897Overview:
20898"""""""""
20899
20900The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
20901of the specified value, returning the same value as the libm '``sqrt``'
20902functions would, but without setting ``errno``.
20903
20904Arguments:
20905""""""""""
20906
The first argument and the return value are floating-point numbers of the same
20908type.
20909
20910The second and third arguments specify the rounding mode and exception
20911behavior as described above.
20912
20913Semantics:
20914""""""""""
20915
20916This function returns the nonnegative square root of the specified value.
20917If the value is less than negative zero, a floating-point exception occurs
20918and the return value is architecture specific.
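
A minimal usage sketch, with a hypothetical function name ``@root``:

.. code-block:: llvm

   declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)

   define double @root(double %x) #0 {
     ; With fpexcept.strict, the exception raised for a negative %x must be
     ; preserved, so this call cannot be speculated or removed as if it had
     ; no side effects.
     %r = call double @llvm.experimental.constrained.sqrt.f64(
              double %x,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret double %r
   }

   attributes #0 = { strictfp }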
20919
20920
20921'``llvm.experimental.constrained.pow``' Intrinsic
20922^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20923
20924Syntax:
20925"""""""
20926
20927::
20928
20929      declare <type>
20930      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
20931                                         metadata <rounding mode>,
20932                                         metadata <exception behavior>)
20933
20934Overview:
20935"""""""""
20936
20937The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
20938raised to the (positive or negative) power specified by the second operand.
20939
20940Arguments:
20941""""""""""
20942
20943The first two arguments and the return value are floating-point numbers of the
20944same type.  The second argument specifies the power to which the first argument
20945should be raised.
20946
20947The third and fourth arguments specify the rounding mode and exception
20948behavior as described above.
20949
20950Semantics:
20951""""""""""
20952
This function returns the first value raised to the power given by the second value,
20954returning the same values as the libm ``pow`` functions would, and
20955handles error conditions in the same way.
20956
20957
20958'``llvm.experimental.constrained.powi``' Intrinsic
20959^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20960
20961Syntax:
20962"""""""
20963
20964::
20965
20966      declare <type>
20967      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
20968                                          metadata <rounding mode>,
20969                                          metadata <exception behavior>)
20970
20971Overview:
20972"""""""""
20973
20974The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
20975raised to the (positive or negative) power specified by the second operand. The
20976order of evaluation of multiplications is not defined. When a vector of
20977floating-point type is used, the second argument remains a scalar integer value.
20978
20979
20980Arguments:
20981""""""""""
20982
20983The first argument and the return value are floating-point numbers of the same
20984type.  The second argument is a 32-bit signed integer specifying the power to
20985which the first argument should be raised.
20986
20987The third and fourth arguments specify the rounding mode and exception
20988behavior as described above.
20989
20990Semantics:
20991""""""""""
20992
This function returns the first value raised to the power given by the second value, with an
20994unspecified sequence of rounding operations.
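
The sketch below shows the vector form, in which the exponent stays a scalar
``i32``. The function name ``@cube`` is hypothetical, and the ``.v2f64`` name
suffix is an assumption based on the usual naming of overloaded intrinsics:

.. code-block:: llvm

   declare <2 x double> @llvm.experimental.constrained.powi.v2f64(<2 x double>, i32, metadata, metadata)

   define <2 x double> @cube(<2 x double> %v) #0 {
     ; Each element is raised to the third power; how the multiplications
     ; are ordered and rounded is unspecified.
     %r = call <2 x double> @llvm.experimental.constrained.powi.v2f64(
              <2 x double> %v, i32 3,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret <2 x double> %r
   }

   attributes #0 = { strictfp }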
20995
20996
20997'``llvm.experimental.constrained.sin``' Intrinsic
20998^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20999
21000Syntax:
21001"""""""
21002
21003::
21004
21005      declare <type>
21006      @llvm.experimental.constrained.sin(<type> <op1>,
21007                                         metadata <rounding mode>,
21008                                         metadata <exception behavior>)
21009
21010Overview:
21011"""""""""
21012
21013The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
21014first operand.
21015
21016Arguments:
21017""""""""""
21018
The first argument and the return value are floating-point numbers of the same
type.
21021
21022The second and third arguments specify the rounding mode and exception
21023behavior as described above.
21024
21025Semantics:
21026""""""""""
21027
21028This function returns the sine of the specified operand, returning the
21029same values as the libm ``sin`` functions would, and handles error
21030conditions in the same way.
21031
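A minimal sketch of a call to the ``float`` overload follows. The function
name ``@sin_strict``, the ``.f32`` suffix, and the metadata choices are
assumptions for this example. The other unary constrained math intrinsics
that take both a rounding mode and an exception behavior argument are called
in the same way.

.. code-block:: llvm

  declare float @llvm.experimental.constrained.sin.f32(float, metadata, metadata)

  ; Hypothetical wrapper: sine of %x with a dynamic rounding mode and
  ; strict exception semantics.
  define float @sin_strict(float %x) #0 {
  entry:
    %r = call float @llvm.experimental.constrained.sin.f32(
                        float %x,
                        metadata !"round.dynamic",
                        metadata !"fpexcept.strict") #0
    ret float %r
  }

  attributes #0 = { strictfp }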
21032
21033'``llvm.experimental.constrained.cos``' Intrinsic
21034^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21035
21036Syntax:
21037"""""""
21038
21039::
21040
21041      declare <type>
21042      @llvm.experimental.constrained.cos(<type> <op1>,
21043                                         metadata <rounding mode>,
21044                                         metadata <exception behavior>)
21045
21046Overview:
21047"""""""""
21048
21049The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
21050first operand.
21051
21052Arguments:
21053""""""""""
21054
The first argument and the return value are floating-point numbers of the same
type.
21057
21058The second and third arguments specify the rounding mode and exception
21059behavior as described above.
21060
21061Semantics:
21062""""""""""
21063
21064This function returns the cosine of the specified operand, returning the
21065same values as the libm ``cos`` functions would, and handles error
21066conditions in the same way.
21067
21068
21069'``llvm.experimental.constrained.exp``' Intrinsic
21070^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21071
21072Syntax:
21073"""""""
21074
21075::
21076
21077      declare <type>
21078      @llvm.experimental.constrained.exp(<type> <op1>,
21079                                         metadata <rounding mode>,
21080                                         metadata <exception behavior>)
21081
21082Overview:
21083"""""""""
21084
21085The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
21086exponential of the specified value.
21087
21088Arguments:
21089""""""""""
21090
21091The first argument and the return value are floating-point numbers of the same
21092type.
21093
21094The second and third arguments specify the rounding mode and exception
21095behavior as described above.
21096
21097Semantics:
21098""""""""""
21099
21100This function returns the same values as the libm ``exp`` functions
21101would, and handles error conditions in the same way.
21102
21103
21104'``llvm.experimental.constrained.exp2``' Intrinsic
21105^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21106
21107Syntax:
21108"""""""
21109
21110::
21111
21112      declare <type>
21113      @llvm.experimental.constrained.exp2(<type> <op1>,
21114                                          metadata <rounding mode>,
21115                                          metadata <exception behavior>)
21116
21117Overview:
21118"""""""""
21119
21120The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
21121exponential of the specified value.
21122
21123
21124Arguments:
21125""""""""""
21126
21127The first argument and the return value are floating-point numbers of the same
21128type.
21129
21130The second and third arguments specify the rounding mode and exception
21131behavior as described above.
21132
21133Semantics:
21134""""""""""
21135
21136This function returns the same values as the libm ``exp2`` functions
21137would, and handles error conditions in the same way.
21138
21139
21140'``llvm.experimental.constrained.log``' Intrinsic
21141^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21142
21143Syntax:
21144"""""""
21145
21146::
21147
21148      declare <type>
21149      @llvm.experimental.constrained.log(<type> <op1>,
21150                                         metadata <rounding mode>,
21151                                         metadata <exception behavior>)
21152
21153Overview:
21154"""""""""
21155
21156The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
21157logarithm of the specified value.
21158
21159Arguments:
21160""""""""""
21161
21162The first argument and the return value are floating-point numbers of the same
21163type.
21164
21165The second and third arguments specify the rounding mode and exception
21166behavior as described above.
21167
21168
21169Semantics:
21170""""""""""
21171
21172This function returns the same values as the libm ``log`` functions
21173would, and handles error conditions in the same way.
21174
21175
21176'``llvm.experimental.constrained.log10``' Intrinsic
21177^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21178
21179Syntax:
21180"""""""
21181
21182::
21183
21184      declare <type>
21185      @llvm.experimental.constrained.log10(<type> <op1>,
21186                                           metadata <rounding mode>,
21187                                           metadata <exception behavior>)
21188
21189Overview:
21190"""""""""
21191
21192The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
21193logarithm of the specified value.
21194
21195Arguments:
21196""""""""""
21197
21198The first argument and the return value are floating-point numbers of the same
21199type.
21200
21201The second and third arguments specify the rounding mode and exception
21202behavior as described above.
21203
21204Semantics:
21205""""""""""
21206
21207This function returns the same values as the libm ``log10`` functions
21208would, and handles error conditions in the same way.
21209
21210
21211'``llvm.experimental.constrained.log2``' Intrinsic
21212^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21213
21214Syntax:
21215"""""""
21216
21217::
21218
21219      declare <type>
21220      @llvm.experimental.constrained.log2(<type> <op1>,
21221                                          metadata <rounding mode>,
21222                                          metadata <exception behavior>)
21223
21224Overview:
21225"""""""""
21226
21227The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
21228logarithm of the specified value.
21229
21230Arguments:
21231""""""""""
21232
21233The first argument and the return value are floating-point numbers of the same
21234type.
21235
21236The second and third arguments specify the rounding mode and exception
21237behavior as described above.
21238
21239Semantics:
21240""""""""""
21241
21242This function returns the same values as the libm ``log2`` functions
21243would, and handles error conditions in the same way.
21244
21245
21246'``llvm.experimental.constrained.rint``' Intrinsic
21247^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21248
21249Syntax:
21250"""""""
21251
21252::
21253
21254      declare <type>
21255      @llvm.experimental.constrained.rint(<type> <op1>,
21256                                          metadata <rounding mode>,
21257                                          metadata <exception behavior>)
21258
21259Overview:
21260"""""""""
21261
21262The '``llvm.experimental.constrained.rint``' intrinsic returns the first
21263operand rounded to the nearest integer. It may raise an inexact floating-point
21264exception if the operand is not an integer.
21265
21266Arguments:
21267""""""""""
21268
21269The first argument and the return value are floating-point numbers of the same
21270type.
21271
21272The second and third arguments specify the rounding mode and exception
21273behavior as described above.
21274
21275Semantics:
21276""""""""""
21277
21278This function returns the same values as the libm ``rint`` functions
21279would, and handles error conditions in the same way.  The rounding mode is
21280described, not determined, by the rounding mode argument.  The actual rounding
21281mode is determined by the runtime floating-point environment.  The rounding
21282mode argument is only intended as information to the compiler.
21283
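The following sketch illustrates the point that the rounding mode argument
only describes the environment. It assumes a front end that knows the default
mode is in effect at this call and therefore passes ``round.tonearest``. The
function name ``@rint_default_mode`` and the ``.f64`` suffix are assumptions.

.. code-block:: llvm

  declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)

  ; Hypothetical wrapper. The metadata does not change the rounding mode;
  ; it only tells the compiler which mode it may assume is active.
  define double @rint_default_mode(double %x) #0 {
  entry:
    %r = call double @llvm.experimental.constrained.rint.f64(
                         double %x,
                         metadata !"round.tonearest",
                         metadata !"fpexcept.strict") #0
    ret double %r
  }

  attributes #0 = { strictfp }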
21284
21285'``llvm.experimental.constrained.lrint``' Intrinsic
21286^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21287
21288Syntax:
21289"""""""
21290
21291::
21292
21293      declare <inttype>
21294      @llvm.experimental.constrained.lrint(<fptype> <op1>,
21295                                           metadata <rounding mode>,
21296                                           metadata <exception behavior>)
21297
21298Overview:
21299"""""""""
21300
21301The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
21302operand rounded to the nearest integer. An inexact floating-point exception
21303will be raised if the operand is not an integer. An invalid exception is
21304raised if the result is too large to fit into a supported integer type,
21305and in this case the result is undefined.
21306
21307Arguments:
21308""""""""""
21309
21310The first argument is a floating-point number. The return value is an
21311integer type. Not all types are supported on all targets. The supported
21312types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
21313libm functions.
21314
21315The second and third arguments specify the rounding mode and exception
21316behavior as described above.
21317
21318Semantics:
21319""""""""""
21320
21321This function returns the same values as the libm ``lrint`` functions
21322would, and handles error conditions in the same way.
21323
21324The rounding mode is described, not determined, by the rounding mode
21325argument.  The actual rounding mode is determined by the runtime floating-point
21326environment.  The rounding mode argument is only intended as information
21327to the compiler.
21328
If the runtime floating-point environment is using the default rounding mode
then the results will be the same as those of the ``llvm.lrint`` intrinsic.
21331
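Because the return type and the operand type differ, both appear in the
intrinsic name. The sketch below assumes an ``i64`` result from a ``double``
operand. The function name ``@lrint_strict`` and the metadata choices are
illustrative assumptions, and whether this type combination is supported
depends on the target.

.. code-block:: llvm

  declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

  ; Hypothetical wrapper: rounds %x to an integer using the rounding mode
  ; that is active at run time.
  define i64 @lrint_strict(double %x) #0 {
  entry:
    %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                      double %x,
                      metadata !"round.dynamic",
                      metadata !"fpexcept.strict") #0
    ret i64 %r
  }

  attributes #0 = { strictfp }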
21332
21333'``llvm.experimental.constrained.llrint``' Intrinsic
21334^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21335
21336Syntax:
21337"""""""
21338
21339::
21340
21341      declare <inttype>
21342      @llvm.experimental.constrained.llrint(<fptype> <op1>,
21343                                            metadata <rounding mode>,
21344                                            metadata <exception behavior>)
21345
21346Overview:
21347"""""""""
21348
21349The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
21350operand rounded to the nearest integer. An inexact floating-point exception
21351will be raised if the operand is not an integer. An invalid exception is
21352raised if the result is too large to fit into a supported integer type,
21353and in this case the result is undefined.
21354
21355Arguments:
21356""""""""""
21357
21358The first argument is a floating-point number. The return value is an
21359integer type. Not all types are supported on all targets. The supported
21360types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
21361libm functions.
21362
21363The second and third arguments specify the rounding mode and exception
21364behavior as described above.
21365
21366Semantics:
21367""""""""""
21368
21369This function returns the same values as the libm ``llrint`` functions
21370would, and handles error conditions in the same way.
21371
21372The rounding mode is described, not determined, by the rounding mode
21373argument.  The actual rounding mode is determined by the runtime floating-point
21374environment.  The rounding mode argument is only intended as information
21375to the compiler.
21376
If the runtime floating-point environment is using the default rounding mode
then the results will be the same as those of the ``llvm.llrint`` intrinsic.
21379
21380
21381'``llvm.experimental.constrained.nearbyint``' Intrinsic
21382^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21383
21384Syntax:
21385"""""""
21386
21387::
21388
21389      declare <type>
21390      @llvm.experimental.constrained.nearbyint(<type> <op1>,
21391                                               metadata <rounding mode>,
21392                                               metadata <exception behavior>)
21393
21394Overview:
21395"""""""""
21396
21397The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
21398operand rounded to the nearest integer. It will not raise an inexact
21399floating-point exception if the operand is not an integer.
21400
21401
21402Arguments:
21403""""""""""
21404
21405The first argument and the return value are floating-point numbers of the same
21406type.
21407
21408The second and third arguments specify the rounding mode and exception
21409behavior as described above.
21410
21411Semantics:
21412""""""""""
21413
21414This function returns the same values as the libm ``nearbyint`` functions
21415would, and handles error conditions in the same way.  The rounding mode is
21416described, not determined, by the rounding mode argument.  The actual rounding
21417mode is determined by the runtime floating-point environment.  The rounding
21418mode argument is only intended as information to the compiler.
21419
21420
21421'``llvm.experimental.constrained.maxnum``' Intrinsic
21422^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21423
21424Syntax:
21425"""""""
21426
21427::
21428
21429      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
21431                                            metadata <exception behavior>)
21432
21433Overview:
21434"""""""""
21435
21436The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
21437of the two arguments.
21438
21439Arguments:
21440""""""""""
21441
21442The first two arguments and the return value are floating-point numbers
21443of the same type.
21444
21445The third argument specifies the exception behavior as described above.
21446
21447Semantics:
21448""""""""""
21449
21450This function follows the IEEE-754 semantics for maxNum.
21451
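Unlike the intrinsics above, this one takes no rounding mode argument, so a
call passes only the exception behavior. A minimal sketch, in which the
function name ``@max_strict`` and the ``.f64`` suffix are assumptions:

.. code-block:: llvm

  declare double @llvm.experimental.constrained.maxnum.f64(double, double, metadata)

  ; Hypothetical wrapper: maxNum of %a and %b with strict exception semantics.
  define double @max_strict(double %a, double %b) #0 {
  entry:
    %r = call double @llvm.experimental.constrained.maxnum.f64(
                         double %a, double %b,
                         metadata !"fpexcept.strict") #0
    ret double %r
  }

  attributes #0 = { strictfp }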
21452
21453'``llvm.experimental.constrained.minnum``' Intrinsic
21454^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21455
21456Syntax:
21457"""""""
21458
21459::
21460
21461      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
21463                                            metadata <exception behavior>)
21464
21465Overview:
21466"""""""""
21467
21468The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
21469of the two arguments.
21470
21471Arguments:
21472""""""""""
21473
21474The first two arguments and the return value are floating-point numbers
21475of the same type.
21476
21477The third argument specifies the exception behavior as described above.
21478
21479Semantics:
21480""""""""""
21481
21482This function follows the IEEE-754 semantics for minNum.
21483
21484
21485'``llvm.experimental.constrained.maximum``' Intrinsic
21486^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21487
21488Syntax:
21489"""""""
21490
21491::
21492
21493      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
21495                                             metadata <exception behavior>)
21496
21497Overview:
21498"""""""""
21499
21500The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
21501of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21502
21503Arguments:
21504""""""""""
21505
21506The first two arguments and the return value are floating-point numbers
21507of the same type.
21508
21509The third argument specifies the exception behavior as described above.
21510
21511Semantics:
21512""""""""""
21513
21514This function follows semantics specified in the draft of IEEE 754-2018.
21515
21516
21517'``llvm.experimental.constrained.minimum``' Intrinsic
21518^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21519
21520Syntax:
21521"""""""
21522
21523::
21524
21525      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
21527                                             metadata <exception behavior>)
21528
21529Overview:
21530"""""""""
21531
21532The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
21533of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21534
21535Arguments:
21536""""""""""
21537
21538The first two arguments and the return value are floating-point numbers
21539of the same type.
21540
21541The third argument specifies the exception behavior as described above.
21542
21543Semantics:
21544""""""""""
21545
21546This function follows semantics specified in the draft of IEEE 754-2018.
21547
21548
21549'``llvm.experimental.constrained.ceil``' Intrinsic
21550^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21551
21552Syntax:
21553"""""""
21554
21555::
21556
21557      declare <type>
21558      @llvm.experimental.constrained.ceil(<type> <op1>,
21559                                          metadata <exception behavior>)
21560
21561Overview:
21562"""""""""
21563
21564The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
21565first operand.
21566
21567Arguments:
21568""""""""""
21569
21570The first argument and the return value are floating-point numbers of the same
21571type.
21572
21573The second argument specifies the exception behavior as described above.
21574
21575Semantics:
21576""""""""""
21577
21578This function returns the same values as the libm ``ceil`` functions
21579would and handles error conditions in the same way.
21580
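The rounding direction is fixed by the operation itself, so only the exception
behavior is passed. The sketch below uses ``fpexcept.maytrap`` to show a
choice other than ``fpexcept.strict``. The function name ``@ceil_maytrap`` and
the ``.f32`` suffix are assumptions.

.. code-block:: llvm

  declare float @llvm.experimental.constrained.ceil.f32(float, metadata)

  ; Hypothetical wrapper: ceiling of %x. The single metadata operand is the
  ; exception behavior; there is no rounding mode operand.
  define float @ceil_maytrap(float %x) #0 {
  entry:
    %r = call float @llvm.experimental.constrained.ceil.f32(
                        float %x,
                        metadata !"fpexcept.maytrap") #0
    ret float %r
  }

  attributes #0 = { strictfp }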
21581
21582'``llvm.experimental.constrained.floor``' Intrinsic
21583^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21584
21585Syntax:
21586"""""""
21587
21588::
21589
21590      declare <type>
21591      @llvm.experimental.constrained.floor(<type> <op1>,
21592                                           metadata <exception behavior>)
21593
21594Overview:
21595"""""""""
21596
21597The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
21598first operand.
21599
21600Arguments:
21601""""""""""
21602
21603The first argument and the return value are floating-point numbers of the same
21604type.
21605
21606The second argument specifies the exception behavior as described above.
21607
21608Semantics:
21609""""""""""
21610
21611This function returns the same values as the libm ``floor`` functions
21612would and handles error conditions in the same way.
21613
21614
21615'``llvm.experimental.constrained.round``' Intrinsic
21616^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21617
21618Syntax:
21619"""""""
21620
21621::
21622
21623      declare <type>
21624      @llvm.experimental.constrained.round(<type> <op1>,
21625                                           metadata <exception behavior>)
21626
21627Overview:
21628"""""""""
21629
21630The '``llvm.experimental.constrained.round``' intrinsic returns the first
21631operand rounded to the nearest integer.
21632
21633Arguments:
21634""""""""""
21635
21636The first argument and the return value are floating-point numbers of the same
21637type.
21638
21639The second argument specifies the exception behavior as described above.
21640
21641Semantics:
21642""""""""""
21643
21644This function returns the same values as the libm ``round`` functions
21645would and handles error conditions in the same way.
21646
21647
21648'``llvm.experimental.constrained.roundeven``' Intrinsic
21649^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21650
21651Syntax:
21652"""""""
21653
21654::
21655
21656      declare <type>
21657      @llvm.experimental.constrained.roundeven(<type> <op1>,
21658                                               metadata <exception behavior>)
21659
21660Overview:
21661"""""""""
21662
21663The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
21664operand rounded to the nearest integer in floating-point format, rounding
21665halfway cases to even (that is, to the nearest value that is an even integer),
21666regardless of the current rounding direction.
21667
21668Arguments:
21669""""""""""
21670
21671The first argument and the return value are floating-point numbers of the same
21672type.
21673
21674The second argument specifies the exception behavior as described above.
21675
21676Semantics:
21677""""""""""
21678
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as the C standard function ``roundeven`` and can
signal the invalid operation exception for a signaling NaN operand.
21682
21683
21684'``llvm.experimental.constrained.lround``' Intrinsic
21685^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21686
21687Syntax:
21688"""""""
21689
21690::
21691
21692      declare <inttype>
21693      @llvm.experimental.constrained.lround(<fptype> <op1>,
21694                                            metadata <exception behavior>)
21695
21696Overview:
21697"""""""""
21698
21699The '``llvm.experimental.constrained.lround``' intrinsic returns the first
21700operand rounded to the nearest integer with ties away from zero.  It will
21701raise an inexact floating-point exception if the operand is not an integer.
21702An invalid exception is raised if the result is too large to fit into a
21703supported integer type, and in this case the result is undefined.
21704
21705Arguments:
21706""""""""""
21707
21708The first argument is a floating-point number. The return value is an
21709integer type. Not all types are supported on all targets. The supported
21710types are the same as the ``llvm.lround`` intrinsic and the ``lround``
21711libm functions.
21712
21713The second argument specifies the exception behavior as described above.
21714
21715Semantics:
21716""""""""""
21717
21718This function returns the same values as the libm ``lround`` functions
21719would and handles error conditions in the same way.
21720
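A minimal sketch of a call that produces an ``i32`` from a ``float`` follows.
Both types appear in the intrinsic name. The function name ``@lround_strict``
is an assumption, and whether this type combination is supported depends on
the target.

.. code-block:: llvm

  declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)

  ; Hypothetical wrapper: rounds %x to the nearest integer, ties away from
  ; zero, independent of the current rounding mode.
  define i32 @lround_strict(float %x) #0 {
  entry:
    %r = call i32 @llvm.experimental.constrained.lround.i32.f32(
                      float %x,
                      metadata !"fpexcept.strict") #0
    ret i32 %r
  }

  attributes #0 = { strictfp }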
21721
21722'``llvm.experimental.constrained.llround``' Intrinsic
21723^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21724
21725Syntax:
21726"""""""
21727
21728::
21729
21730      declare <inttype>
21731      @llvm.experimental.constrained.llround(<fptype> <op1>,
21732                                             metadata <exception behavior>)
21733
21734Overview:
21735"""""""""
21736
21737The '``llvm.experimental.constrained.llround``' intrinsic returns the first
21738operand rounded to the nearest integer with ties away from zero. It will
21739raise an inexact floating-point exception if the operand is not an integer.
21740An invalid exception is raised if the result is too large to fit into a
21741supported integer type, and in this case the result is undefined.
21742
21743Arguments:
21744""""""""""
21745
21746The first argument is a floating-point number. The return value is an
21747integer type. Not all types are supported on all targets. The supported
21748types are the same as the ``llvm.llround`` intrinsic and the ``llround``
21749libm functions.
21750
21751The second argument specifies the exception behavior as described above.
21752
21753Semantics:
21754""""""""""
21755
21756This function returns the same values as the libm ``llround`` functions
21757would and handles error conditions in the same way.
21758
21759
21760'``llvm.experimental.constrained.trunc``' Intrinsic
21761^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21762
21763Syntax:
21764"""""""
21765
21766::
21767
21768      declare <type>
21769      @llvm.experimental.constrained.trunc(<type> <op1>,
21770                                           metadata <exception behavior>)
21771
21772Overview:
21773"""""""""
21774
21775The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
21776operand rounded to the nearest integer not larger in magnitude than the
21777operand.
21778
21779Arguments:
21780""""""""""
21781
21782The first argument and the return value are floating-point numbers of the same
21783type.
21784
21785The second argument specifies the exception behavior as described above.
21786
21787Semantics:
21788""""""""""
21789
21790This function returns the same values as the libm ``trunc`` functions
21791would and handles error conditions in the same way.
21792
21793.. _int_experimental_noalias_scope_decl:
21794
21795'``llvm.experimental.noalias.scope.decl``' Intrinsic
21796^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21797
21798Syntax:
21799"""""""
21800
21801
21802::
21803
21804      declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
21805
21806Overview:
21807"""""""""
21808
The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.
21813
21814
21815Arguments:
21816""""""""""
21817
21818The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
21819metadata references. The format is identical to that required for ``noalias``
21820metadata. This list must have exactly one element.
21821
21822Semantics:
21823""""""""""
21824
The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.
21829
21830For example, when the intrinsic is used inside a loop body, and that loop is
21831unrolled, the associated noalias scope must also be duplicated. Otherwise, the
21832noalias property it signifies would spill across loop iterations, whereas it
21833was only valid within a single iteration.
21834
21835.. code-block:: llvm
21836
  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
  ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
  ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
  declare i1 @cond() ; external loop condition, declared so that the example parses
  define void @decl_in_loop(i8* %a.base, i8* %b.base) {
21841  entry:
21842    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
21843    br label %loop
21844
21845  loop:
21846    %a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]
21847    %b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]
21848    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
21849    %val = load i8, i8* %a, !alias.scope !2
21850    store i8 %val, i8* %b, !noalias !2
21851    %a.inc = getelementptr inbounds i8, i8* %a, i64 1
21852    %b.inc = getelementptr inbounds i8, i8* %b, i64 1
21853    %cond = call i1 @cond()
21854    br i1 %cond, label %loop, label %exit
21855
21856  exit:
21857    ret void
21858  }
21859
21860  !0 = !{!0} ; domain
21861  !1 = !{!1, !0} ; scope
21862  !2 = !{!1} ; scope list
21863
Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
are possible, but one should never dominate another. Violations are pointed out
by the verifier as they indicate a problem in either a transformation pass or
the input.
21868
21869
21870Floating Point Environment Manipulation intrinsics
21871--------------------------------------------------
21872
These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.
21876
21877'``llvm.flt.rounds``' Intrinsic
21878^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21879
21880Syntax:
21881"""""""
21882
21883::
21884
21885      declare i32 @llvm.flt.rounds()
21886
21887Overview:
21888"""""""""
21889
21890The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.
21891
21892Semantics:
21893""""""""""
21894
The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
The encoding of the returned value is the same as the result of ``FLT_ROUNDS``,
as specified by the C standard:
21898
21899::
21900
21901    0  - toward zero
21902    1  - to nearest, ties to even
21903    2  - toward positive infinity
21904    3  - toward negative infinity
21905    4  - to nearest, ties away from zero
21906
Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.
21909
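As a small sketch of how the encoding above might be consumed, the function
below tests whether the current mode is "to nearest, ties to even" (value
``1``). The function name ``@is_round_to_nearest_even`` is an assumption made
for this example.

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()

  ; Hypothetical helper: returns true when the current rounding mode is
  ; encoded as 1 (to nearest, ties to even).
  define i1 @is_round_to_nearest_even() {
  entry:
    %mode = call i32 @llvm.flt.rounds()
    %is.rne = icmp eq i32 %mode, 1
    ret i1 %is.rne
  }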
21910
21911'``llvm.set.rounding``' Intrinsic
21912^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21913
21914Syntax:
21915"""""""
21916
21917::
21918
21919      declare void @llvm.set.rounding(i32 <val>)
21920
21921Overview:
21922"""""""""
21923
The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
21925
21926Arguments:
21927""""""""""
21928
The argument is the required rounding mode. The encoding of the rounding mode
is the same as that used by '``llvm.flt.rounds``'.
21931
21932Semantics:
21933""""""""""
21934
The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does
not return any value and uses a platform-independent representation of IEEE
rounding modes.
21939
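One possible usage pattern, sketched under assumptions, is to save the current
mode with ``llvm.flt.rounds``, switch modes, and restore the saved value
afterwards. The function name ``@with_round_toward_zero`` is hypothetical, and
the value ``0`` is the "toward zero" encoding used by ``llvm.flt.rounds``.

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()
  declare void @llvm.set.rounding(i32)

  ; Hypothetical helper: temporarily selects rounding toward zero.
  define void @with_round_toward_zero() #0 {
  entry:
    %saved = call i32 @llvm.flt.rounds()       ; remember the current mode
    call void @llvm.set.rounding(i32 0)        ; 0 = toward zero
    ; ... floating-point code that depends on the new mode would go here ...
    call void @llvm.set.rounding(i32 %saved)   ; restore the previous mode
    ret void
  }

  attributes #0 = { strictfp }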
21940
21941General Intrinsics
21942------------------
21943
21944This class of intrinsics is designed to be generic and has no specific
21945purpose.
21946
21947'``llvm.var.annotation``' Intrinsic
21948^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21949
21950Syntax:
21951"""""""
21952
21953::
21954
21955      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
21956
21957Overview:
21958"""""""""
21959
The '``llvm.var.annotation``' intrinsic annotates a local variable with an
arbitrary string.
21961
21962Arguments:
21963""""""""""
21964
21965The first argument is a pointer to a value, the second is a pointer to a
21966global string, the third is a pointer to a global string which is the
21967source file name, and the last argument is the line number.
21968
21969Semantics:
21970""""""""""
21971
21972This intrinsic allows annotation of local variables with arbitrary
21973strings. This can be useful for special purpose optimizations that want
21974to look for these annotations. These have no other defined use; they are
21975ignored by code generation and optimization.
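
The sketch below shows one way a front end might annotate a stack slot,
following the four-argument declaration given in the Syntax section above.
The global names ``@.note`` and ``@.file``, the string contents, the function
name ``@annotated_local``, and the line number ``12`` are all assumptions made
for illustration.

.. code-block:: llvm

  @.note = private unnamed_addr constant [8 x i8] c"my_note\00", section "llvm.metadata"
  @.file = private unnamed_addr constant [7 x i8] c"file.c\00", section "llvm.metadata"

  declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

  ; Hypothetical function: attaches the string "my_note" to the local %x.
  define void @annotated_local() {
  entry:
    %x = alloca i32, align 4
    %x.i8 = bitcast i32* %x to i8*
    call void @llvm.var.annotation(
        i8* %x.i8,
        i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.note, i32 0, i32 0),
        i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.file, i32 0, i32 0),
        i32 12)
    ret void
  }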
21976
21977'``llvm.ptr.annotation.*``' Intrinsic
21978^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21979
21980Syntax:
21981"""""""
21982
21983This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
21984pointer to an integer of any width. *NOTE* you must specify an address space for
21985the pointer. The identifier for the default address space is the integer
21986'``0``'.
21987
21988::
21989
21990      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
21991      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
21992      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
21993      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
21994      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)
21995
21996Overview:
21997"""""""""
21998
The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with
an arbitrary string.
22000
22001Arguments:
22002""""""""""
22003
22004The first argument is a pointer to an integer value of arbitrary bitwidth
22005(result of some expression), the second is a pointer to a global string, the
22006third is a pointer to a global string which is the source file name, and the
22007last argument is the line number. It returns the value of the first argument.
22008
22009Semantics:
22010""""""""""
22011
22012This intrinsic allows annotation of a pointer to an integer with arbitrary
22013strings. This can be useful for special purpose optimizations that want to look
22014for these annotations. These have no other defined use; they are ignored by code
22015generation and optimization.
22016
22017'``llvm.annotation.*``' Intrinsic
22018^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22019
22020Syntax:
22021"""""""
22022
22023This is an overloaded intrinsic. You can use '``llvm.annotation``' on
22024any integer bit width.
22025
22026::
22027
22028      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
22029      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
22030      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
22031      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
22032      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)
22033
22034Overview:
22035"""""""""
22036
22037The '``llvm.annotation``' intrinsic.
22038
22039Arguments:
22040""""""""""
22041
22042The first argument is an integer value (result of some expression), the
22043second is a pointer to a global string, the third is a pointer to a
22044global string which is the source file name, and the last argument is
22045the line number. It returns the value of the first argument.
22046
22047Semantics:
22048""""""""""
22049
22050This intrinsic allows annotations to be put on arbitrary expressions
22051with arbitrary strings. This can be useful for special purpose
22052optimizations that want to look for these annotations. These have no
22053other defined use; they are ignored by code generation and optimization.
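
A minimal sketch of a call, following the ``i32`` declaration as it is given
above. The global names ``@.note`` and ``@.file``, the string contents, and the
line number ``12`` are hypothetical and chosen only for illustration.

.. code-block:: llvm

    declare i32 @llvm.annotation.i32(i32, i8*, i8*, i32)

    ; Hypothetical annotation string and source file name.
    @.note = private unnamed_addr constant [4 x i8] c"hot\00"
    @.file = private unnamed_addr constant [4 x i8] c"a.c\00"

    define i32 @tagged(i32 %x) {
    entry:
      %note = getelementptr inbounds [4 x i8], [4 x i8]* @.note, i64 0, i64 0
      %file = getelementptr inbounds [4 x i8], [4 x i8]* @.file, i64 0, i64 0
      ; The call returns its first argument, so %r carries the value of %x.
      %r = call i32 @llvm.annotation.i32(i32 %x, i8* %note, i8* %file, i32 12)
      ret i32 %r
    }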
22054
22055'``llvm.codeview.annotation``' Intrinsic
22056^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22057
22058Syntax:
22059"""""""
22060
22061This annotation emits a label at its program point and an associated
22062``S_ANNOTATION`` codeview record with some additional string metadata. This is
22063used to implement MSVC's ``__annotation`` intrinsic. It is marked
22064``noduplicate``, so calls to this intrinsic prevent inlining and should be
22065considered expensive.
22066
22067::
22068
22069      declare void @llvm.codeview.annotation(metadata)
22070
22071Arguments:
22072""""""""""
22073
22074The argument should be an MDTuple containing any number of MDStrings.
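
As an illustration, a call might look like the following sketch. The strings in
the tuple are hypothetical; only the shape (an MDTuple of MDStrings) comes from
the description above.

.. code-block:: llvm

    declare void @llvm.codeview.annotation(metadata)

    define void @f() {
    entry:
      ; Emits a label here plus an S_ANNOTATION record carrying the strings.
      call void @llvm.codeview.annotation(metadata !0)
      ret void
    }

    ; Hypothetical annotation strings.
    !0 = !{!"perf", !"hot path"}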
22075
22076'``llvm.trap``' Intrinsic
22077^^^^^^^^^^^^^^^^^^^^^^^^^
22078
22079Syntax:
22080"""""""
22081
22082::
22083
22084      declare void @llvm.trap() cold noreturn nounwind
22085
22086Overview:
22087"""""""""
22088
22089The '``llvm.trap``' intrinsic.
22090
22091Arguments:
22092""""""""""
22093
22094None.
22095
22096Semantics:
22097""""""""""
22098
22099This intrinsic is lowered to the target dependent trap instruction. If
22100the target does not have a trap instruction, this intrinsic will be
22101lowered to a call of the ``abort()`` function.
22102
22103'``llvm.debugtrap``' Intrinsic
22104^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22105
22106Syntax:
22107"""""""
22108
22109::
22110
22111      declare void @llvm.debugtrap() nounwind
22112
22113Overview:
22114"""""""""
22115
22116The '``llvm.debugtrap``' intrinsic.
22117
22118Arguments:
22119""""""""""
22120
22121None.
22122
22123Semantics:
22124""""""""""
22125
22126This intrinsic is lowered to code which is intended to cause an
22127execution trap with the intention of requesting the attention of a
22128debugger.
22129
22130'``llvm.ubsantrap``' Intrinsic
22131^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22132
22133Syntax:
22134"""""""
22135
22136::
22137
22138      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
22139
22140Overview:
22141"""""""""
22142
22143The '``llvm.ubsantrap``' intrinsic.
22144
22145Arguments:
22146""""""""""
22147
22148An integer describing the kind of failure detected.
22149
22150Semantics:
22151""""""""""
22152
This intrinsic is lowered to code which is intended to cause an execution trap
and which, where possible, embeds the argument into the encoding of that trap
so that different crashes can be distinguished.
22156
22157Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.
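
The following sketch shows one way a bounds check could report its failure
through this intrinsic. The function name and the failure kind ``7`` are
hypothetical; the meaning of the integer is left to whoever emits it.

.. code-block:: llvm

    declare void @llvm.ubsantrap(i8 immarg)

    define i32 @load_checked(i32* %arr, i64 %idx, i64 %len) {
    entry:
      %in_bounds = icmp ult i64 %idx, %len
      br i1 %in_bounds, label %ok, label %fail

    fail:
      ; 7 is an arbitrary, hypothetical failure kind; it must be a constant.
      call void @llvm.ubsantrap(i8 7)
      unreachable

    ok:
      %addr = getelementptr inbounds i32, i32* %arr, i64 %idx
      %v = load i32, i32* %addr
      ret i32 %v
    }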
22158
22159'``llvm.stackprotector``' Intrinsic
22160^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22161
22162Syntax:
22163"""""""
22164
22165::
22166
22167      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)
22168
22169Overview:
22170"""""""""
22171
22172The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
22173onto the stack at ``slot``. The stack slot is adjusted to ensure that it
22174is placed on the stack before local variables.
22175
22176Arguments:
22177""""""""""
22178
The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.
22183
22184Semantics:
22185""""""""""
22186
22187This intrinsic causes the prologue/epilogue inserter to force the position of
22188the ``AllocaInst`` stack slot to be before local variables on the stack. This is
22189to ensure that if a local variable on the stack is overwritten, it will destroy
22190the value of the guard. When the function exits, the guard on the stack is
22191checked against the original guard by ``llvm.stackprotectorcheck``. If they are
22192different, then ``llvm.stackprotectorcheck`` causes the program to abort by
22193calling the ``__stack_chk_fail()`` function.
22194
22195'``llvm.stackguard``' Intrinsic
22196^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22197
22198Syntax:
22199"""""""
22200
22201::
22202
22203      declare i8* @llvm.stackguard()
22204
22205Overview:
22206"""""""""
22207
22208The ``llvm.stackguard`` intrinsic returns the system stack guard value.
22209
It should not be generated by frontends, since it is only for internal use.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.
22213
22214Arguments:
22215""""""""""
22216
22217None.
22218
22219Semantics:
22220""""""""""
22221
22222On some platforms, the value returned by this intrinsic remains unchanged
22223between loads in the same thread. On other platforms, it returns the same
22224global variable value, if any, e.g. ``@__stack_chk_guard``.
22225
Currently, some platforms (e.g. X86 Linux) have customized IR-level stack guard
loading that is not handled by ``llvm.stackguard()``, although it should be in
the future.
22229
22230'``llvm.objectsize``' Intrinsic
22231^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22232
22233Syntax:
22234"""""""
22235
22236::
22237
22238      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22239      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22240
22241Overview:
22242"""""""""
22243
22244The ``llvm.objectsize`` intrinsic is designed to provide information to the
22245optimizer to determine whether a) an operation (like memcpy) will overflow a
22246buffer that corresponds to an object, or b) that a runtime check for overflow
22247isn't necessary. An object in this context means an allocation of a specific
22248class, structure, array, or other object.
22249
22250Arguments:
22251""""""""""
22252
22253The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
22254pointer to or into the ``object``. The second argument determines whether
22255``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
22256unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
22257in address space 0 is used as its pointer argument. If it's ``false``,
22258``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
22259the ``null`` is in a non-zero address space or if ``true`` is given for the
22260third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
22261argument to ``llvm.objectsize`` determines if the value should be evaluated at
22262runtime.
22263
22264The second, third, and fourth arguments only accept constants.
22265
22266Semantics:
22267""""""""""
22268
22269The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
22270the object concerned. If the size cannot be determined, ``llvm.objectsize``
22271returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
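
A minimal sketch of a query on a pointer into a stack object. The ``.p0i8``
suffix in the name is the typed-pointer overload mangling assumed here, and the
function name is hypothetical.

.. code-block:: llvm

    declare i64 @llvm.objectsize.i64.p0i8(i8*, i1 immarg, i1 immarg, i1 immarg)

    define i64 @remaining_bytes() {
    entry:
      %buf = alloca [16 x i8]
      ; Point 4 bytes into the 16-byte object.
      %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
      ; min = false (return -1 if unknown), nullunknown = true, dynamic = false.
      %size = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false, i1 true, i1 false)
      ret i64 %size
    }

Under the semantics described above, a pass that can see the ``alloca`` would be
expected to fold the call to the number of bytes remaining after the offset; if
it cannot, the ``min`` argument selects the fallback value.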
22272
22273'``llvm.expect``' Intrinsic
22274^^^^^^^^^^^^^^^^^^^^^^^^^^^
22275
22276Syntax:
22277"""""""
22278
22279This is an overloaded intrinsic. You can use ``llvm.expect`` on any
22280integer bit width.
22281
22282::
22283
22284      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
22285      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
22286      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
22287
22288Overview:
22289"""""""""
22290
The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.
22293
22294Arguments:
22295""""""""""
22296
22297The ``llvm.expect`` intrinsic takes two arguments. The first argument is
22298a value. The second argument is an expected value.
22299
22300Semantics:
22301""""""""""
22302
22303This intrinsic is lowered to the ``val``.
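
As a sketch, the hint is typically applied to a condition just before it is
branched on. The function and block names below are hypothetical.

.. code-block:: llvm

    declare i1 @llvm.expect.i1(i1, i1)

    define i32 @checked_div(i32 %a, i32 %b) {
    entry:
      %is_zero = icmp eq i32 %b, 0
      ; Tell the optimizer that %is_zero is most likely false.
      %hint = call i1 @llvm.expect.i1(i1 %is_zero, i1 false)
      br i1 %hint, label %error, label %ok

    error:
      ret i32 -1

    ok:
      %q = udiv i32 %a, %b
      ret i32 %q
    }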
22304
22305'``llvm.expect.with.probability``' Intrinsic
22306^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22307
22308Syntax:
22309"""""""
22310
22311This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
22312You can use ``llvm.expect.with.probability`` on any integer bit width.
22313
22314::
22315
22316      declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
22317      declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
22318      declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
22319
22320Overview:
22321"""""""""
22322
The ``llvm.expect.with.probability`` intrinsic provides information about the
expected value of ``val`` with probability (or confidence) ``prob``, which can
be used by optimizers.
22326
22327Arguments:
22328""""""""""
22329
22330The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
22331argument is a value. The second argument is an expected value. The third
22332argument is a probability.
22333
22334Semantics:
22335""""""""""
22336
22337This intrinsic is lowered to the ``val``.
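
A short sketch of the three-argument form. The probability ``0.9`` and the
function name are assumptions made for illustration.

.. code-block:: llvm

    declare i32 @llvm.expect.with.probability.i32(i32, i32, double)

    define i1 @is_common_tag(i32 %tag) {
    entry:
      ; Hint that %tag equals 0 with an assumed probability of 0.9.
      %hint = call i32 @llvm.expect.with.probability.i32(i32 %tag, i32 0, double 9.000000e-01)
      %r = icmp eq i32 %hint, 0
      ret i1 %r
    }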
22338
22339.. _int_assume:
22340
22341'``llvm.assume``' Intrinsic
22342^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22343
22344Syntax:
22345"""""""
22346
22347::
22348
22349      declare void @llvm.assume(i1 %cond)
22350
22351Overview:
22352"""""""""
22353
The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
22355condition is true. This information can then be used in simplifying other parts
22356of the code.
22357
22358More complex assumptions can be encoded as
22359:ref:`assume operand bundles <assume_opbundles>`.
22360
22361Arguments:
22362""""""""""
22363
22364The argument of the call is the condition which the optimizer may assume is
22365always true.
22366
22367Semantics:
22368""""""""""
22369
22370The intrinsic allows the optimizer to assume that the provided condition is
22371always true whenever the control flow reaches the intrinsic call. No code is
22372generated for this intrinsic, and instructions that contribute only to the
22373provided condition are not used for code generation. If the condition is
22374violated during execution, the behavior is undefined.
22375
22376Note that the optimizer might limit the transformations performed on values
22377used by the ``llvm.assume`` intrinsic in order to preserve the instructions
22378only used to form the intrinsic's input argument. This might prove undesirable
22379if the extra information provided by the ``llvm.assume`` intrinsic does not cause
22380sufficient overall improvement in code quality. For this reason,
22381``llvm.assume`` should not be used to document basic mathematical invariants
22382that the optimizer can otherwise deduce or facts that are of little use to the
22383optimizer.
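
The sketch below illustrates the idea with a hypothetical function: once the
loaded value is assumed to be positive, a later comparison against zero becomes
something the optimizer may simplify.

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @is_nonzero(i32* %p) {
    entry:
      %v = load i32, i32* %p
      %is_pos = icmp sgt i32 %v, 0
      ; Behavior is undefined if %is_pos is false when control reaches here.
      call void @llvm.assume(i1 %is_pos)
      ; Given the assumption, an optimizer may fold this comparison to true.
      %nonzero = icmp ne i32 %v, 0
      %r = zext i1 %nonzero to i32
      ret i32 %r
    }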
22384
22385.. _int_ssa_copy:
22386
22387'``llvm.ssa.copy``' Intrinsic
22388^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22389
22390Syntax:
22391"""""""
22392
22393::
22394
22395      declare type @llvm.ssa.copy(type %operand) returned(1) readnone
22396
22397Arguments:
22398""""""""""
22399
22400The first argument is an operand which is used as the returned value.
22401
22402Overview:
22403""""""""""
22404
22405The ``llvm.ssa.copy`` intrinsic can be used to attach information to
22406operations by copying them and giving them new names.  For example,
22407the PredicateInfo utility uses it to build Extended SSA form, and
22408attach various forms of information to operands that dominate specific
22409uses.  It is not meant for general use, only for building temporary
22410renaming forms that require value splits at certain points.
22411
22412.. _type.test:
22413
22414'``llvm.type.test``' Intrinsic
22415^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22416
22417Syntax:
22418"""""""
22419
22420::
22421
22422      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone
22423
22424
22425Arguments:
22426""""""""""
22427
22428The first argument is a pointer to be tested. The second argument is a
22429metadata object representing a :doc:`type identifier <TypeMetadata>`.
22430
22431Overview:
22432"""""""""
22433
22434The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
22435with the given type identifier.
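
One possible use, sketched here, is to check a function pointer before an
indirect call and trap on failure. The type identifier string and the function
name are hypothetical.

.. code-block:: llvm

    declare i1 @llvm.type.test(i8*, metadata)
    declare void @llvm.trap()

    define void @call_checked(void ()* %fp) {
    entry:
      %raw = bitcast void ()* %fp to i8*
      ; "example.typeid" is a hypothetical type identifier.
      %ok = call i1 @llvm.type.test(i8* %raw, metadata !"example.typeid")
      br i1 %ok, label %cont, label %fail

    fail:
      call void @llvm.trap()
      unreachable

    cont:
      call void %fp()
      ret void
    }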
22436
22437.. _type.checked.load:
22438
22439'``llvm.type.checked.load``' Intrinsic
22440^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22441
22442Syntax:
22443"""""""
22444
22445::
22446
22447      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly
22448
22449
22450Arguments:
22451""""""""""
22452
22453The first argument is a pointer from which to load a function pointer. The
22454second argument is the byte offset from which to load the function pointer. The
22455third argument is a metadata object representing a :doc:`type identifier
22456<TypeMetadata>`.
22457
22458Overview:
22459"""""""""
22460
22461The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
22462virtual table pointer using type metadata. This intrinsic is used to implement
22463control flow integrity in conjunction with virtual call optimization. The
22464virtual call optimization pass will optimize away ``llvm.type.checked.load``
22465intrinsics associated with devirtualized calls, thereby removing the type
22466check in cases where it is not needed to enforce the control flow integrity
22467constraint.
22468
22469If the given pointer is associated with a type metadata identifier, this
22470function returns true as the second element of its return value. (Note that
22471the function may also return true if the given pointer is not associated
22472with a type metadata identifier.) If the function's return value's second
22473element is true, the following rules apply to the first element:
22474
22475- If the given pointer is associated with the given type metadata identifier,
22476  it is the function pointer loaded from the given byte offset from the given
22477  pointer.
22478
22479- If the given pointer is not associated with the given type metadata
22480  identifier, it is one of the following (the choice of which is unspecified):
22481
22482  1. The function pointer that would have been loaded from an arbitrarily chosen
22483     (through an unspecified mechanism) pointer associated with the type
22484     metadata.
22485
22486  2. If the function has a non-void return type, a pointer to a function that
22487     returns an unspecified value without causing side effects.
22488
22489If the function's return value's second element is false, the value of the
22490first element is undefined.
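
The rules above can be sketched as a checked virtual call: the second element
of the result is tested first, and the first element is used only when that
test succeeds. The type identifier and function name are hypothetical.

.. code-block:: llvm

    declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
    declare void @llvm.trap()

    define void @vcall(i8* %vtable) {
    entry:
      ; Load the function pointer at byte offset 0 of the virtual table.
      %pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtable, i32 0, metadata !"example.typeid")
      %ok = extractvalue {i8*, i1} %pair, 1
      br i1 %ok, label %cont, label %fail

    fail:
      call void @llvm.trap()
      unreachable

    cont:
      ; Only meaningful because the second element was true.
      %raw = extractvalue {i8*, i1} %pair, 0
      %fn = bitcast i8* %raw to void ()*
      call void %fn()
      ret void
    }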
22491
22492
22493'``llvm.arithmetic.fence``' Intrinsic
22494^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22495
22496Syntax:
22497"""""""
22498
22499::
22500
      declare <type> @llvm.arithmetic.fence(<type> <op>)
22503
22504Overview:
22505"""""""""
22506
22507The purpose of the ``llvm.arithmetic.fence`` intrinsic
22508is to prevent the optimizer from performing fast-math optimizations,
22509particularly reassociation,
22510between the argument and the expression that contains the argument.
22511It can be used to preserve the parentheses in the source language.
22512
22513Arguments:
22514""""""""""
22515
22516The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
22517The argument and the return value are floating-point numbers,
22518or vector floating-point numbers, of the same type.
22519
22520Semantics:
22521""""""""""
22522
22523This intrinsic returns the value of its operand. The optimizer can optimize
22524the argument, but the optimizer cannot hoist any component of the operand
22525to the containing context, and the optimizer cannot move the calculation of
22526any expression in the containing context into the operand.
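
As a sketch, a source expression such as ``(a + b) + c`` could keep its
grouping under reassociation as shown below. The ``.f32`` suffix is the overload
mangling assumed here for ``float``, and the function name is hypothetical.

.. code-block:: llvm

    declare float @llvm.arithmetic.fence.f32(float)

    define float @sum3(float %a, float %b, float %c) {
    entry:
      %ab = fadd reassoc float %a, %b
      ; The fence keeps (a + b) from being regrouped with the addition of %c.
      %grouped = call float @llvm.arithmetic.fence.f32(float %ab)
      %r = fadd reassoc float %grouped, %c
      ret float %r
    }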
22527
22528
22529'``llvm.donothing``' Intrinsic
22530^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22531
22532Syntax:
22533"""""""
22534
22535::
22536
22537      declare void @llvm.donothing() nounwind readnone
22538
22539Overview:
22540"""""""""
22541
22542The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
22543three intrinsics (besides ``llvm.experimental.patchpoint`` and
22544``llvm.experimental.gc.statepoint``) that can be called with an invoke
22545instruction.
22546
22547Arguments:
22548""""""""""
22549
22550None.
22551
22552Semantics:
22553""""""""""
22554
22555This intrinsic does nothing, and it's removed by optimizers and ignored
22556by codegen.
22557
22558'``llvm.experimental.deoptimize``' Intrinsic
22559^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22560
22561Syntax:
22562"""""""
22563
22564::
22565
22566      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
22567
22568Overview:
22569"""""""""
22570
This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
22573frame-local state from the currently executing (typically more specialized,
22574hence faster) version of a function into another (typically more generic, hence
22575slower) version.
22576
22577In languages with a fully integrated managed runtime like Java and JavaScript
22578this intrinsic can be used to implement "uncommon trap" or "side exit" like
22579functionality.  In unmanaged languages like C and C++, this intrinsic can be
22580used to represent the slow paths of specialized functions.
22581
22582
22583Arguments:
22584""""""""""
22585
22586The intrinsic takes an arbitrary number of arguments, whose meaning is
22587decided by the :ref:`lowering strategy<deoptimize_lowering>`.
22588
22589Semantics:
22590""""""""""
22591
22592The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
22593deoptimization continuation (denoted using a :ref:`deoptimization
22594operand bundle <deopt_opbundles>`) and returns the value returned by
22595the deoptimization continuation.  Defining the semantic properties of
22596the continuation itself is out of scope of the language reference --
22597as far as LLVM is concerned, the deoptimization continuation can
22598invoke arbitrary side effects, including reading from and writing to
22599the entire heap.
22600
22601Deoptimization continuations expressed using ``"deopt"`` operand bundles always
22602continue execution to the end of the physical frame containing them, so all
22603calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
22604
22605   - ``@llvm.experimental.deoptimize`` cannot be invoked.
22606   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
22607   - The ``ret`` instruction must return the value produced by the
22608     ``@llvm.experimental.deoptimize`` call if there is one, or void.
22609
22610Note that the above restrictions imply that the return type for a call to
22611``@llvm.experimental.deoptimize`` will match the return type of its immediate
22612caller.
22613
22614The inliner composes the ``"deopt"`` continuations of the caller into the
22615``"deopt"`` continuations present in the inlinee, and also updates calls to this
22616intrinsic to return directly from the frame of the function it inlined into.
22617
22618All declarations of ``@llvm.experimental.deoptimize`` must share the
22619same calling convention.
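
A minimal sketch of a call in tail position. The ``.i32`` suffix reflects the
return type and is the mangling assumed here; the function name and the choice
of state passed in the ``"deopt"`` bundle are hypothetical.

.. code-block:: llvm

    declare i32 @llvm.experimental.deoptimize.i32(...)

    define i32 @specialized(i32 %x) {
    entry:
      %is_small = icmp ult i32 %x, 100
      br i1 %is_small, label %fast, label %slow

    fast:
      %r = add i32 %x, 1
      ret i32 %r

    slow:
      ; The call immediately precedes a ret that returns its result.
      %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
      ret i32 %v
    }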
22620
22621.. _deoptimize_lowering:
22622
22623Lowering:
22624"""""""""
22625
22626Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
22627symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
22628ensure that this symbol is defined).  The call arguments to
22629``@llvm.experimental.deoptimize`` are lowered as if they were formal
22630arguments of the specified types, and not as varargs.
22631
22632
22633'``llvm.experimental.guard``' Intrinsic
22634^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22635
22636Syntax:
22637"""""""
22638
22639::
22640
22641      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
22642
22643Overview:
22644"""""""""
22645
22646This intrinsic, together with :ref:`deoptimization operand bundles
22647<deopt_opbundles>`, allows frontends to express guards or checks on
22648optimistic assumptions made during compilation.  The semantics of
22649``@llvm.experimental.guard`` is defined in terms of
22650``@llvm.experimental.deoptimize`` -- its body is defined to be
22651equivalent to:
22652
22653.. code-block:: text
22654
22655  define void @llvm.experimental.guard(i1 %pred, <args...>) {
22656    %realPred = and i1 %pred, undef
22657    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
22658
22659  leave:
22660    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
22661    ret void
22662
22663  continue:
22664    ret void
22665  }
22666
22667
22668with the optional ``[, !make.implicit !{}]`` present if and only if it
22669is present on the call site.  For more details on ``!make.implicit``,
22670see :doc:`FaultMaps`.
22671
22672In words, ``@llvm.experimental.guard`` executes the attached
22673``"deopt"`` continuation if (but **not** only if) its first argument
22674is ``false``.  Since the optimizer is allowed to replace the ``undef``
22675with an arbitrary value, it can optimize guard to fail "spuriously",
22676i.e. without the original condition being false (hence the "not only
22677if"); and this allows for "check widening" type optimizations.
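
At a call site, a guard might be used as in the following sketch, where a
bounds check protects a load. The function name and the state recorded in the
``"deopt"`` bundle are hypothetical.

.. code-block:: llvm

    declare void @llvm.experimental.guard(i1, ...)

    define i32 @load_guarded(i32* %arr, i32 %idx, i32 %len) {
    entry:
      %in_bounds = icmp ult i32 %idx, %len
      ; Deoptimizes if %in_bounds is false (and possibly spuriously).
      call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %idx) ]
      %addr = getelementptr inbounds i32, i32* %arr, i32 %idx
      %v = load i32, i32* %addr
      ret i32 %v
    }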
22678
22679``@llvm.experimental.guard`` cannot be invoked.
22680
22681After ``@llvm.experimental.guard`` was first added, a more general
22682formulation was found in ``@llvm.experimental.widenable.condition``.
Support for ``@llvm.experimental.guard`` is slowly being rephrased in
terms of this alternative.
22685
22686'``llvm.experimental.widenable.condition``' Intrinsic
22687^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22688
22689Syntax:
22690"""""""
22691
22692::
22693
22694      declare i1 @llvm.experimental.widenable.condition()
22695
22696Overview:
22697"""""""""
22698
This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is ``true`` or ``false``, the program is correct and
well-defined.
22703
22704Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
22705``@llvm.experimental.widenable.condition`` allows frontends to
22706express guards or checks on optimistic assumptions made during
22707compilation and represent them as branch instructions on special
22708conditions.
22709
While this may appear similar in semantics to ``undef``, it is very
22711different in that an invocation produces a particular, singular
22712value. It is also intended to be lowered late, and remain available
22713for specific optimizations and transforms that can benefit from its
22714special properties.
22715
22716Arguments:
22717""""""""""
22718
22719None.
22720
22721Semantics:
22722""""""""""
22723
The intrinsic ``@llvm.experimental.widenable.condition()``
returns either ``true`` or ``false``. For each evaluation of a call
to this intrinsic, the program must be valid and correct both if
it returns ``true`` and if it returns ``false``. This allows
transformation passes to replace evaluations of this intrinsic
with either value whenever one is beneficial.

When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as in the
example below:
22734
22735.. code-block:: text
22736
    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.
22745
Whether the result of the intrinsic call is ``true`` or ``false``,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
``i1`` expressions.
22751
22752This is how it can be used to represent guards as widenable branches:
22753
22754.. code-block:: text
22755
22756  block:
22757    ; Unguarded instructions
22758    call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
22759    ; Guarded instructions
22760
It can be expressed in an alternative, equivalent form as an explicit branch
using ``@llvm.experimental.widenable.condition``:
22763
22764.. code-block:: text
22765
22766  block:
22767    ; Unguarded instructions
22768    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
22769    %guard_condition = and i1 %cond, %widenable_condition
22770    br i1 %guard_condition, label %guarded, label %deopt
22771
22772  guarded:
22773    ; Guarded instructions
22774
22775  deopt:
22776    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
22777
So the block ``guarded`` is only reachable when ``%cond`` is ``true``,
and it should be valid to go to the block ``deopt`` whether ``%cond``
is ``true`` or ``false``.
22781
22782``@llvm.experimental.widenable.condition`` will never throw, thus
22783it cannot be invoked.
22784
22785Guard widening:
22786"""""""""""""""
22787
When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.

Guard widening looks like the replacement of
22794
22795.. code-block:: text
22796
22797  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
22798  %guard_cond = and i1 %cond, %widenable_cond
22799  br i1 %guard_cond, label %guarded, label %deopt
22800
22801with
22802
22803.. code-block:: text
22804
22805  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
22806  %new_cond = and i1 %any_other_cond, %widenable_cond
22807  %new_guard_cond = and i1 %cond, %new_cond
22808  br i1 %new_guard_cond, label %guarded, label %deopt
22809
for this branch. Here ``%any_other_cond`` is an arbitrarily chosen
well-defined ``i1`` value. By widening the guard, we may
impose stricter conditions on the ``guarded`` block and bail to the
``deopt`` block when the new condition is not met.
22814
22815Lowering:
22816"""""""""
22817
The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant ``true``. However, it is always correct to replace
it with any other ``i1`` value. Any pass can
freely do so if it can benefit from non-default lowering.
22823
22824
22825'``llvm.load.relative``' Intrinsic
22826^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22827
22828Syntax:
22829"""""""
22830
22831::
22832
22833      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly
22834
22835Overview:
22836"""""""""
22837
22838This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
22839adds ``%ptr`` to that value and returns it. The constant folder specifically
22840recognizes the form of this intrinsic and the constant initializers it may
22841load from; if a loaded constant initializer is known to have the form
22842``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
22843
LLVM guarantees that the calculation of such a constant initializer will
22845not overflow at link time under the medium code model if ``x`` is an
22846``unnamed_addr`` function. However, it does not provide this guarantee for
22847a constant initializer folded into a function body. This intrinsic can be
22848used to avoid the possibility of overflows when loading from such a constant.
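
The following sketch shows a one-entry table of 32-bit relative offsets and a
lookup through the intrinsic. The names ``@f``, ``@table``, and ``@lookup`` are
hypothetical, and the initializer is written in the ``i32 trunc(x - %ptr)`` form
described above.

.. code-block:: llvm

    declare i8* @llvm.load.relative.i32(i8*, i32)

    define void @f() unnamed_addr {
    entry:
      ret void
    }

    ; Each entry stores the target address minus the address of the table.
    @table = private constant [1 x i32] [
      i32 trunc (i64 sub (i64 ptrtoint (void ()* @f to i64),
                          i64 ptrtoint ([1 x i32]* @table to i64)) to i32)
    ]

    define i8* @lookup() {
    entry:
      %base = bitcast [1 x i32]* @table to i8*
      ; Loads the i32 at %base + 0 and adds %base back to it.
      %fn = call i8* @llvm.load.relative.i32(i8* %base, i32 0)
      ret i8* %fn
    }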
22849
22850.. _llvm_sideeffect:
22851
22852'``llvm.sideeffect``' Intrinsic
22853^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22854
22855Syntax:
22856"""""""
22857
22858::
22859
22860      declare void @llvm.sideeffect() inaccessiblememonly nounwind
22861
22862Overview:
22863"""""""""
22864
22865The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
22866treat it as having side effects, so it can be inserted into a loop to
22867indicate that the loop shouldn't be assumed to terminate (which could
22868potentially lead to the loop being optimized away entirely), even if it's
22869an infinite loop with no other side effects.
22870
22871Arguments:
22872""""""""""
22873
22874None.
22875
22876Semantics:
22877""""""""""
22878
22879This intrinsic actually does nothing, but optimizers must assume that it
22880has externally observable side effects.
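
For example, a deliberately infinite loop with no other side effects could be
written as in this sketch (the function name is hypothetical):

.. code-block:: llvm

    declare void @llvm.sideeffect()

    define void @spin() {
    entry:
      br label %loop

    loop:
      ; Marks each iteration as having an observable side effect.
      call void @llvm.sideeffect()
      br label %loop
    }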
22881
22882'``llvm.is.constant.*``' Intrinsic
22883^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22884
22885Syntax:
22886"""""""
22887
This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any
argument type.
22889
22890::
22891
22892      declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
22893      declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
22894      declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone
22895
22896Overview:
22897"""""""""
22898
22899The '``llvm.is.constant``' intrinsic will return true if the argument
22900is known to be a manifest compile-time constant. It is guaranteed to
22901fold to either true or false before generating machine code.
22902
22903Semantics:
22904""""""""""
22905
22906This intrinsic generates no code. If its argument is known to be a
22907manifest compile-time constant value, then the intrinsic will be
22908converted to a constant true value. Otherwise, it will be converted to
22909a constant false value.
22910
In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.
22915
22916The result also intentionally depends on the result of optimization
22917passes -- e.g., the result can change depending on whether a
22918function gets inlined or not. A function's parameters are
22919obviously not constant. However, a call like
22920``llvm.is.constant.i32(i32 %param)`` *can* return true after the
22921function is inlined, if the value passed to the function parameter was
22922a constant.
22923
22924On the other hand, if constant folding is not run, it will never
22925evaluate to true, even in simple cases.
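
A sketch of a typical pattern: the result selects between two paths that
compute the same value, so either folding outcome is correct. The function and
block names are hypothetical.

.. code-block:: llvm

    declare i1 @llvm.is.constant.i32(i32)

    define i32 @times8(i32 %x) {
    entry:
      ; Folds to true only if %x becomes a manifest constant, e.g. after inlining.
      %known = call i1 @llvm.is.constant.i32(i32 %x)
      br i1 %known, label %const_path, label %general_path

    const_path:
      %a = mul i32 %x, 8
      ret i32 %a

    general_path:
      %b = shl i32 %x, 3
      ret i32 %b
    }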
22926
22927.. _int_ptrmask:
22928
22929'``llvm.ptrmask``' Intrinsic
22930^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22931
22932Syntax:
22933"""""""
22934
22935::
22936
      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable
22938
22939Arguments:
22940""""""""""
22941
22942The first argument is a pointer. The second argument is an integer.
22943
22944Overview:
22945""""""""""
22946
22947The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
22948This allows stripping data from tagged pointers without converting them to an
22949integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
22950to facilitate alias analysis and underlying-object detection.
22951
22952Semantics:
22953""""""""""
22954
22955The result of ``ptrmask(ptr, mask)`` is equivalent to
22956``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``. Both the returned
22957pointer and the first argument are based on the same underlying object (for more
22958information on the *based on* terminology see
22959:ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
22960mask argument does not match the pointer size of the target, the mask is
22961zero-extended or truncated accordingly.
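
A minimal sketch of stripping tag bits, assuming 64-bit pointers in address
space 0 whose three low bits carry a tag. The mangled suffix ``.p0i8.i64`` and
the function name ``@strip_tag`` are assumptions made for this illustration:

.. code-block:: llvm

      declare i8* @llvm.ptrmask.p0i8.i64(i8*, i64)

      define i8* @strip_tag(i8* %tagged) {
        ; -8 has every bit set except the low three, so the tag bits are
        ; cleared while the result stays based on the same object as %tagged.
        %clean = call i8* @llvm.ptrmask.p0i8.i64(i8* %tagged, i64 -8)
        ret i8* %clean
      }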
22962
22963.. _int_vscale:
22964
22965'``llvm.vscale``' Intrinsic
22966^^^^^^^^^^^^^^^^^^^^^^^^^^^
22967
22968Syntax:
22969"""""""
22970
22971::
22972
      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()
22975
22976Overview:
22977"""""""""
22978
22979The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
22980vectors such as ``<vscale x 16 x i8>``.
22981
22982Semantics:
22983""""""""""
22984
22985``vscale`` is a positive value that is constant throughout program
22986execution, but is unknown at compile time.
22987If the result value does not fit in the result type, then the result is
22988a :ref:`poison value <poisonvalues>`.
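
For illustration, a sketch that derives the runtime element count of a
``<vscale x 4 x i32>`` vector from ``vscale`` (the function name
``@num_i32_lanes`` is hypothetical):

.. code-block:: llvm

      declare i64 @llvm.vscale.i64()

      define i64 @num_i32_lanes() {
        %vs = call i64 @llvm.vscale.i64()
        ; <vscale x 4 x i32> holds vscale * 4 elements.
        %n = mul i64 %vs, 4
        ret i64 %n
      }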
22989
22990
22991Stack Map Intrinsics
22992--------------------
22993
22994LLVM provides experimental intrinsics to support runtime patching
22995mechanisms commonly desired in dynamic language JITs. These intrinsics
22996are described in :doc:`StackMaps`.
22997
22998Element Wise Atomic Memory Intrinsics
22999-------------------------------------
23000
23001These intrinsics are similar to the standard library memory intrinsics except
23002that they perform memory transfer as a sequence of atomic memory accesses.
23003
23004.. _int_memcpy_element_unordered_atomic:
23005
23006'``llvm.memcpy.element.unordered.atomic``' Intrinsic
23007^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23008
23009Syntax:
23010"""""""
23011
23012This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
23013any integer bit width and for different address spaces. Not all targets
support all bit widths, however.
23015
23016::
23017
23018      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
23019                                                                       i8* <src>,
23020                                                                       i32 <len>,
23021                                                                       i32 <element_size>)
23022      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
23023                                                                       i8* <src>,
23024                                                                       i64 <len>,
23025                                                                       i32 <element_size>)
23026
23027Overview:
23028"""""""""
23029
23030The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
23031'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
23032as arrays with elements that are exactly ``element_size`` bytes, and the copy between
23033buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
23034that are a positive integer multiple of the ``element_size`` in size.
23035
23036Arguments:
23037""""""""""
23038
23039The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
23040intrinsic, with the added constraint that ``len`` is required to be a positive integer
23041multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
23042``element_size``, then the behaviour of the intrinsic is undefined.
23043
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.
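
A sketch of a call that satisfies these constraints, assuming the target
supports 4-byte atomic accesses (the function name ``@copy_words`` and the
sizes are hypothetical). ``len`` (64) is a positive multiple of
``element_size`` (4), and both pointers carry ``align 4``:

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8*, i8*, i64, i32)

      define void @copy_words(i8* %dst, i8* %src) {
        ; Copies 16 elements of 4 bytes each; %dst and %src must not overlap.
        call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(
            i8* align 4 %dst, i8* align 4 %src, i64 64, i32 4)
        ret void
      }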
23050
23051Semantics:
23052""""""""""
23053
23054The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
23055memory from the source location to the destination location. These locations are not
23056allowed to overlap. The memory copy is performed as a sequence of load/store operations
23057where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
23058aligned at an ``element_size`` boundary.
23059
23060The order of the copy is unspecified. The same value may be read from the source
23061buffer many times, but only one write is issued to the destination buffer per
23062element. It is well defined to have concurrent reads and writes to both source and
23063destination provided those reads and writes are unordered atomic when specified.
23064
23065This intrinsic does not provide any additional ordering guarantees over those
23066provided by a set of unordered loads from the source location and stores to the
23067destination.
23068
23069Lowering:
23070"""""""""
23071
In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.
23079
23080'``llvm.memmove.element.unordered.atomic``' Intrinsic
23081^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23082
23083Syntax:
23084"""""""
23085
23086This is an overloaded intrinsic. You can use
23087``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths, however.
23089
23090::
23091
23092      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
23093                                                                        i8* <src>,
23094                                                                        i32 <len>,
23095                                                                        i32 <element_size>)
23096      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
23097                                                                        i8* <src>,
23098                                                                        i64 <len>,
23099                                                                        i32 <element_size>)
23100
23101Overview:
23102"""""""""
23103
23104The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
23105of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
23106``src`` are treated as arrays with elements that are exactly ``element_size``
23107bytes, and the copy between buffers uses a sequence of
23108:ref:`unordered atomic <ordering>` load/store operations that are a positive
23109integer multiple of the ``element_size`` in size.
23110
23111Arguments:
23112""""""""""
23113
23114The first three arguments are the same as they are in the
23115:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
23116``len`` is required to be a positive integer multiple of the ``element_size``.
23117If ``len`` is not a positive integer multiple of ``element_size``, then the
23118behaviour of the intrinsic is undefined.
23119
23120``element_size`` must be a compile-time constant positive power of two no
23121greater than a target-specific atomic access size limit.
23122
For each of the input pointers, the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that both the source and destination pointers are aligned to
that boundary.
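
A sketch of an overlapping move, assuming the target supports 8-byte atomic
accesses and that ``%buf`` is 8-byte aligned and at least 72 bytes long (the
function name ``@shift_up`` and the sizes are hypothetical):

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8*, i8*, i64, i32)

      define void @shift_up(i8* %buf) {
        ; The destination starts one element (8 bytes) past the source, so
        ; the two 64-byte ranges overlap, which memmove permits.
        %dst = getelementptr inbounds i8, i8* %buf, i64 8
        call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(
            i8* align 8 %dst, i8* align 8 %buf, i64 64, i32 8)
        ret void
      }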
23127
23128Semantics:
23129""""""""""
23130
23131The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
23132of memory from the source location to the destination location. These locations
23133are allowed to overlap. The memory copy is performed as a sequence of load/store
23134operations where each access is guaranteed to be a multiple of ``element_size``
23135bytes wide and aligned at an ``element_size`` boundary.
23136
23137The order of the copy is unspecified. The same value may be read from the source
23138buffer many times, but only one write is issued to the destination buffer per
23139element. It is well defined to have concurrent reads and writes to both source
23140and destination provided those reads and writes are unordered atomic when
23141specified.
23142
23143This intrinsic does not provide any additional ordering guarantees over those
23144provided by a set of unordered loads from the source location and stores to the
23145destination.
23146
23147Lowering:
23148"""""""""
23149
In the most general case, a call to
'``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
lowering.
23156
23157The optimizer is allowed to inline the memory copy when it's profitable to do so.
23158
23159.. _int_memset_element_unordered_atomic:
23160
23161'``llvm.memset.element.unordered.atomic``' Intrinsic
23162^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23163
23164Syntax:
23165"""""""
23166
23167This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
23168any integer bit width and for different address spaces. Not all targets
support all bit widths, however.
23170
23171::
23172
23173      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
23174                                                                  i8 <value>,
23175                                                                  i32 <len>,
23176                                                                  i32 <element_size>)
23177      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
23178                                                                  i8 <value>,
23179                                                                  i64 <len>,
23180                                                                  i32 <element_size>)
23181
23182Overview:
23183"""""""""
23184
23185The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
23186'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
23187with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
23189that are a positive integer multiple of the ``element_size`` in size.
23190
23191Arguments:
23192""""""""""
23193
23194The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
23195intrinsic, with the added constraint that ``len`` is required to be a positive integer
23196multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
23197``element_size``, then the behaviour of the intrinsic is undefined.
23198
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.
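
A sketch of zeroing a buffer under these constraints, assuming the target
supports 4-byte atomic accesses (the function name ``@zero_words`` and the
sizes are hypothetical):

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8*, i8, i64, i32)

      define void @zero_words(i8* %dst) {
        ; Sets 32 bytes to zero as 8 elements of 4 bytes each.
        call void @llvm.memset.element.unordered.atomic.p0i8.i64(
            i8* align 4 %dst, i8 0, i64 32, i32 4)
        ret void
      }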
23205
23206Semantics:
23207""""""""""
23208
23209The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
23210memory starting at the destination location to the given ``value``. The memory is
23211set with a sequence of store operations where each access is guaranteed to be a
23212multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
23213
23214The order of the assignment is unspecified. Only one write is issued to the
23215destination buffer per element. It is well defined to have concurrent reads and
23216writes to the destination provided those reads and writes are unordered atomic
23217when specified.
23218
23219This intrinsic does not provide any additional ordering guarantees over those
23220provided by a set of unordered stores to the destination.
23221
23222Lowering:
23223"""""""""
23224
In the most general case, a call to '``llvm.memset.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
is replaced with the actual element size.
23228
23229The optimizer is allowed to inline the memory assignment when it's profitable to do so.
23230
23231Objective-C ARC Runtime Intrinsics
23232----------------------------------
23233
23234LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
23235LLVM is aware of the semantics of these functions, and optimizes based on that
23236knowledge. You can read more about the details of Objective-C ARC `here
23237<https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
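
As a rough sketch of how these intrinsics appear in IR, the following pairs a
retain with a release around a use of the object. The callee ``@use`` and the
function name ``@hold`` are hypothetical and stand in for arbitrary code:

.. code-block:: llvm

      declare i8* @llvm.objc.retain(i8*)
      declare void @llvm.objc.release(i8*)
      declare void @use(i8*)

      define void @hold(i8* %obj) {
        ; Keep %obj alive across the call to @use, then drop the reference.
        %retained = call i8* @llvm.objc.retain(i8* %obj)
        call void @use(i8* %retained)
        call void @llvm.objc.release(i8* %retained)
        ret void
      }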
23238
23239'``llvm.objc.autorelease``' Intrinsic
23240^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23241
23242Syntax:
23243"""""""
23244::
23245
23246      declare i8* @llvm.objc.autorelease(i8*)
23247
23248Lowering:
23249"""""""""
23250
23251Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
23252
23253'``llvm.objc.autoreleasePoolPop``' Intrinsic
23254^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23255
23256Syntax:
23257"""""""
23258::
23259
23260      declare void @llvm.objc.autoreleasePoolPop(i8*)
23261
23262Lowering:
23263"""""""""
23264
23265Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
23266
23267'``llvm.objc.autoreleasePoolPush``' Intrinsic
23268^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23269
23270Syntax:
23271"""""""
23272::
23273
23274      declare i8* @llvm.objc.autoreleasePoolPush()
23275
23276Lowering:
23277"""""""""
23278
23279Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
23280
23281'``llvm.objc.autoreleaseReturnValue``' Intrinsic
23282^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23283
23284Syntax:
23285"""""""
23286::
23287
23288      declare i8* @llvm.objc.autoreleaseReturnValue(i8*)
23289
23290Lowering:
23291"""""""""
23292
23293Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
23294
23295'``llvm.objc.copyWeak``' Intrinsic
23296^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23297
23298Syntax:
23299"""""""
23300::
23301
23302      declare void @llvm.objc.copyWeak(i8**, i8**)
23303
23304Lowering:
23305"""""""""
23306
23307Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
23308
23309'``llvm.objc.destroyWeak``' Intrinsic
23310^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23311
23312Syntax:
23313"""""""
23314::
23315
23316      declare void @llvm.objc.destroyWeak(i8**)
23317
23318Lowering:
23319"""""""""
23320
23321Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
23322
23323'``llvm.objc.initWeak``' Intrinsic
23324^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23325
23326Syntax:
23327"""""""
23328::
23329
23330      declare i8* @llvm.objc.initWeak(i8**, i8*)
23331
23332Lowering:
23333"""""""""
23334
23335Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
23336
23337'``llvm.objc.loadWeak``' Intrinsic
23338^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23339
23340Syntax:
23341"""""""
23342::
23343
23344      declare i8* @llvm.objc.loadWeak(i8**)
23345
23346Lowering:
23347"""""""""
23348
23349Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
23350
23351'``llvm.objc.loadWeakRetained``' Intrinsic
23352^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23353
23354Syntax:
23355"""""""
23356::
23357
23358      declare i8* @llvm.objc.loadWeakRetained(i8**)
23359
23360Lowering:
23361"""""""""
23362
23363Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
23364
23365'``llvm.objc.moveWeak``' Intrinsic
23366^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23367
23368Syntax:
23369"""""""
23370::
23371
23372      declare void @llvm.objc.moveWeak(i8**, i8**)
23373
23374Lowering:
23375"""""""""
23376
23377Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
23378
23379'``llvm.objc.release``' Intrinsic
23380^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23381
23382Syntax:
23383"""""""
23384::
23385
23386      declare void @llvm.objc.release(i8*)
23387
23388Lowering:
23389"""""""""
23390
23391Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
23392
23393'``llvm.objc.retain``' Intrinsic
23394^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23395
23396Syntax:
23397"""""""
23398::
23399
23400      declare i8* @llvm.objc.retain(i8*)
23401
23402Lowering:
23403"""""""""
23404
23405Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
23406
23407'``llvm.objc.retainAutorelease``' Intrinsic
23408^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23409
23410Syntax:
23411"""""""
23412::
23413
23414      declare i8* @llvm.objc.retainAutorelease(i8*)
23415
23416Lowering:
23417"""""""""
23418
23419Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
23420
23421'``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
23422^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23423
23424Syntax:
23425"""""""
23426::
23427
23428      declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*)
23429
23430Lowering:
23431"""""""""
23432
23433Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
23434
23435'``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
23436^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23437
23438Syntax:
23439"""""""
23440::
23441
23442      declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*)
23443
23444Lowering:
23445"""""""""
23446
23447Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
23448
23449'``llvm.objc.retainBlock``' Intrinsic
23450^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23451
23452Syntax:
23453"""""""
23454::
23455
23456      declare i8* @llvm.objc.retainBlock(i8*)
23457
23458Lowering:
23459"""""""""
23460
23461Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
23462
23463'``llvm.objc.storeStrong``' Intrinsic
23464^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23465
23466Syntax:
23467"""""""
23468::
23469
23470      declare void @llvm.objc.storeStrong(i8**, i8*)
23471
23472Lowering:
23473"""""""""
23474
23475Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
23476
23477'``llvm.objc.storeWeak``' Intrinsic
23478^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23479
23480Syntax:
23481"""""""
23482::
23483
23484      declare i8* @llvm.objc.storeWeak(i8**, i8*)
23485
23486Lowering:
23487"""""""""
23488
23489Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
23490
23491Preserving Debug Information Intrinsics
---------------------------------------
23493
These intrinsics are used to carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. Such information is lost in the IR ``getelementptr`` instruction
because the IR types differ from the debuginfo types and unions
are converted to structs in IR.
23500
23501'``llvm.preserve.array.access.index``' Intrinsic
23502^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23503
23504Syntax:
23505"""""""
23506::
23507
23508      declare <ret_type>
23509      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
23510                                                                           i32 dim,
23511                                                                           i32 index)
23512
23513Overview:
23514"""""""""
23515
23516The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
23517based on array base ``base``, array dimension ``dim`` and the last access index ``index``
23518into the array. The return type ``ret_type`` is a pointer type to the array element.
The array ``dim`` and ``index`` are preserved, which is more robust than a
getelementptr instruction, which may be subject to compiler transformation.
23521The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23522to provide array or pointer debuginfo type.
23523The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
23524debuginfo version of ``type``.
23525
23526Arguments:
23527""""""""""
23528
23529The ``base`` is the array base address.  The ``dim`` is the array dimension.
23530The ``base`` is a pointer if ``dim`` equals 0.
23531The ``index`` is the last access index into the array or pointer.
23532
23533The ``base`` argument must be annotated with an :ref:`elementtype
23534<attr_elementtype>` attribute at the call-site. This attribute specifies the
23535getelementptr element type.
23536
23537Semantics:
23538""""""""""
23539
The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` whose indices are ``dim`` zeros followed
by ``index``.
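
A sketch for a one-dimensional ``[10 x i32]`` array. The mangled suffix
``.p0i32.p0a10i32`` and the function name ``@fourth_element`` are assumptions
made for this illustration, and the ``llvm.preserve.access.index`` metadata
that a frontend would normally attach is omitted for brevity:

.. code-block:: llvm

      declare i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]*, i32, i32)

      define i32* @fourth_element([10 x i32]* %arr) {
        ; dim = 1 and index = 3, so the result is the same address as
        ;   getelementptr [10 x i32], [10 x i32]* %arr, i32 0, i32 3
        %p = call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32(
            [10 x i32]* elementtype([10 x i32]) %arr, i32 1, i32 3)
        ret i32* %p
      }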
23542
23543'``llvm.preserve.union.access.index``' Intrinsic
23544^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23545
23546Syntax:
23547"""""""
23548::
23549
23550      declare <type>
23551      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
23552                                                                        i32 di_index)
23553
23554Overview:
23555"""""""""
23556
23557The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
23558``di_index`` and returns the ``base`` address.
23559The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23560to provide union debuginfo type.
23561The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23562The return type ``type`` is the same as the ``base`` type.
23563
23564Arguments:
23565""""""""""
23566
23567The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
23568
23569Semantics:
23570""""""""""
23571
23572The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
23573
23574'``llvm.preserve.struct.access.index``' Intrinsic
23575^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23576
23577Syntax:
23578"""""""
23579::
23580
23581      declare <ret_type>
23582      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
23583                                                                 i32 gep_index,
23584                                                                 i32 di_index)
23585
23586Overview:
23587"""""""""
23588
23589The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
23590based on struct base ``base`` and IR struct member index ``gep_index``.
23591The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23592to provide struct debuginfo type.
23593The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23594The return type ``ret_type`` is a pointer type to the structure member.
23595
23596Arguments:
23597""""""""""
23598
23599The ``base`` is the structure base address. The ``gep_index`` is the struct member index
23600based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
23601
23602The ``base`` argument must be annotated with an :ref:`elementtype
23603<attr_elementtype>` attribute at the call-site. This attribute specifies the
23604getelementptr element type.
23605
23606Semantics:
23607""""""""""
23608
23609The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
23610as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
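
A sketch for a two-field struct in which the IR member index and the debuginfo
member index happen to coincide. The type ``%struct.S``, the mangled suffix
``.p0i64.p0s_struct.Ss``, and the function name ``@second_field`` are
assumptions made for this illustration, and the ``llvm.preserve.access.index``
metadata that a frontend would normally attach is omitted for brevity:

.. code-block:: llvm

      %struct.S = type { i32, i64 }

      declare i64* @llvm.preserve.struct.access.index.p0i64.p0s_struct.Ss(%struct.S*, i32, i32)

      define i64* @second_field(%struct.S* %base) {
        ; gep_index = 1 and di_index = 1, so the result is the same address as
        ;   getelementptr %struct.S, %struct.S* %base, i32 0, i32 1
        %p = call i64* @llvm.preserve.struct.access.index.p0i64.p0s_struct.Ss(
            %struct.S* elementtype(%struct.S) %base, i32 1, i32 1)
        ret i64* %p
      }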
23611