1=========================== 2LLVM Branch Weight Metadata 3=========================== 4 5.. contents:: 6 :local: 7 8Introduction 9============ 10 11Branch Weight Metadata represents branch weights as its likeliness to be taken 12(see :doc:`BlockFrequencyTerminology`). Metadata is assigned to an 13``Instruction`` that is a terminator as a ``MDNode`` of the ``MD_prof`` kind. 14The first operator is always a ``MDString`` node with the string 15"branch_weights". Number of operators depends on the terminator type. 16 17Branch weights might be fetch from the profiling file, or generated based on 18`__builtin_expect`_ and `__builtin_expect_with_probability`_ instruction. 19 20All weights are represented as an unsigned 32-bit values, where higher value 21indicates greater chance to be taken. 22 23Supported Instructions 24====================== 25 26``BranchInst`` 27^^^^^^^^^^^^^^ 28 29Metadata is only assigned to the conditional branches. There are two extra 30operands for the true and the false branch. 31 32.. code-block:: none 33 34 !0 = metadata !{ 35 metadata !"branch_weights", 36 i32 <TRUE_BRANCH_WEIGHT>, 37 i32 <FALSE_BRANCH_WEIGHT> 38 } 39 40``SwitchInst`` 41^^^^^^^^^^^^^^ 42 43Branch weights are assigned to every case (including the ``default`` case which 44is always case #0). 45 46.. code-block:: none 47 48 !0 = metadata !{ 49 metadata !"branch_weights", 50 i32 <DEFAULT_BRANCH_WEIGHT> 51 [ , i32 <CASE_BRANCH_WEIGHT> ... ] 52 } 53 54``IndirectBrInst`` 55^^^^^^^^^^^^^^^^^^ 56 57Branch weights are assigned to every destination. 58 59.. code-block:: none 60 61 !0 = metadata !{ 62 metadata !"branch_weights", 63 i32 <LABEL_BRANCH_WEIGHT> 64 [ , i32 <LABEL_BRANCH_WEIGHT> ... ] 65 } 66 67``CallInst`` 68^^^^^^^^^^^^^^^^^^ 69 70Calls may have branch weight metadata, containing the execution count of 71the call. It is currently used in SamplePGO mode only, to augment the 72block and entry counts which may not be accurate with sampling. 73 74.. code-block:: none 75 76 !0 = metadata !{ 77 metadata !"branch_weights", 78 i32 <CALL_BRANCH_WEIGHT> 79 } 80 81``InvokeInst`` 82^^^^^^^^^^^^^^^^^^ 83 84Invoke instruction may have branch weight metadata with one or two weights. 85The second weight is optional and corresponds to the unwind branch. 86If only one weight is set then it contains the execution count of the call 87and used in SamplePGO mode only as described for the call instruction. If both 88weights are specified then the second weight contains count of unwind branch 89taken and the first weights contains the execution count of the call minus 90the count of unwind branch taken. Both weights specified are used to calculate 91BranchProbability as for BranchInst and for SamplePGO the sum of both weights 92is used. 93 94.. code-block:: none 95 96 !0 = metadata !{ 97 metadata !"branch_weights", 98 i32 <INVOKE_NORMAL_WEIGHT> 99 [ , i32 <INVOKE_UNWIND_WEIGHT> ] 100 } 101 102Other 103^^^^^ 104 105Other terminator instructions are not allowed to contain Branch Weight Metadata. 106 107.. _\__builtin_expect: 108 109Built-in ``expect`` Instructions 110================================ 111 112``__builtin_expect(long exp, long c)`` instruction provides branch prediction 113information. The return value is the value of ``exp``. 114 115It is especially useful in conditional statements. Currently Clang supports two 116conditional statements: 117 118``if`` statement 119^^^^^^^^^^^^^^^^ 120 121The ``exp`` parameter is the condition. The ``c`` parameter is the expected 122comparison value. If it is equal to 1 (true), the condition is likely to be 123true, in other case condition is likely to be false. For example: 124 125.. code-block:: c++ 126 127 if (__builtin_expect(x > 0, 1)) { 128 // This block is likely to be taken. 129 } 130 131``switch`` statement 132^^^^^^^^^^^^^^^^^^^^ 133 134The ``exp`` parameter is the value. The ``c`` parameter is the expected 135value. If the expected value doesn't show on the cases list, the ``default`` 136case is assumed to be likely taken. 137 138.. code-block:: c++ 139 140 switch (__builtin_expect(x, 5)) { 141 default: break; 142 case 0: // ... 143 case 3: // ... 144 case 5: // This case is likely to be taken. 145 } 146 147.. _\__builtin_expect_with_probability: 148 149Built-in ``expect.with.probability`` Instruction 150================================================ 151 152``__builtin_expect_with_probability(long exp, long c, double probability)`` has 153the same semantics as ``__builtin_expect``, but the caller provides the 154probability that ``exp == c``. The last argument ``probability`` must be 155constant floating-point expression and be in the range [0.0, 1.0] inclusive. 156The usage is also similar as ``__builtin_expect``, for example: 157 158``if`` statement 159^^^^^^^^^^^^^^^^ 160 161If the expect comparison value ``c`` is equal to 1(true), and probability 162value ``probability`` is set to 0.8, that means the probability of condition 163to be true is 80% while that of false is 20%. 164 165.. code-block:: c++ 166 167 if (__builtin_expect_with_probability(x > 0, 1, 0.8)) { 168 // This block is likely to be taken with probability 80%. 169 } 170 171``switch`` statement 172^^^^^^^^^^^^^^^^^^^^ 173 174This is basically the same as ``switch`` statement in ``__builtin_expect``. 175The probability that ``exp`` is equal to the expect value is given in 176the third argument ``probability``, while the probability of other value is 177the average of remaining probability(``1.0 - probability``). For example: 178 179.. code-block:: c++ 180 181 switch (__builtin_expect_with_probability(x, 5, 0.7)) { 182 default: break; // Take this case with probability 10% 183 case 0: break; // Take this case with probability 10% 184 case 3: break; // Take this case with probability 10% 185 case 5: break; // This case is likely to be taken with probability 70% 186 } 187 188CFG Modifications 189================= 190 191Branch Weight Metatada is not proof against CFG changes. If terminator operands' 192are changed some action should be taken. In other case some misoptimizations may 193occur due to incorrect branch prediction information. 194 195Function Entry Counts 196===================== 197 198To allow comparing different functions during inter-procedural analysis and 199optimization, ``MD_prof`` nodes can also be assigned to a function definition. 200The first operand is a string indicating the name of the associated counter. 201 202Currently, one counter is supported: "function_entry_count". The second operand 203is a 64-bit counter that indicates the number of times that this function was 204invoked (in the case of instrumentation-based profiles). In the case of 205sampling-based profiles, this operand is an approximation of how many times 206the function was invoked. 207 208For example, in the code below, the instrumentation for function foo() 209indicates that it was called 2,590 times at runtime. 210 211.. code-block:: llvm 212 213 define i32 @foo() !prof !1 { 214 ret i32 0 215 } 216 !1 = !{!"function_entry_count", i64 2590} 217 218If "function_entry_count" has more than 2 operands, the later operands are 219the GUID of the functions that needs to be imported by ThinLTO. This is only 220set by sampling based profile. It is needed because the sampling based profile 221was collected on a binary that had already imported and inlined these functions, 222and we need to ensure the IR matches in the ThinLTO backends for profile 223annotation. The reason why we cannot annotate this on the callsite is that it 224can only goes down 1 level in the call chain. For the cases where 225foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc(), we will need to go down 2 levels 226in the call chain to import both bar_in_b_cc and baz_in_c_cc. 227