===========================
LLVM Branch Weight Metadata
===========================

.. contents::
   :local:

Introduction
============

Branch Weight Metadata represents the likelihood of a branch being taken
(see :doc:`BlockFrequencyTerminology`). The metadata is attached to a
terminator ``Instruction`` as an ``MDNode`` of the ``MD_prof`` kind.
The first operand is always an ``MDString`` node with the string
"branch_weights". The number of remaining operands depends on the terminator
type.

Branch weights may be read from a profile file or generated from the
`__builtin_expect`_ and `__builtin_expect_with_probability`_ instructions.

All weights are represented as unsigned 32-bit values, where a higher value
indicates a greater chance of the branch being taken.

Supported Instructions
======================

``BranchInst``
^^^^^^^^^^^^^^

Metadata is only assigned to conditional branches. There are two extra
operands, one for the true branch and one for the false branch.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <TRUE_BRANCH_WEIGHT>,
    i32 <FALSE_BRANCH_WEIGHT>
  }
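
As an illustrative sketch, the conditional branch below carries a weight of
2000 for the true successor and 1 for the false successor. The function name
and the weight values are made up for this example; with these weights the
true edge is estimated to be taken about 2000 out of every 2001 times.

.. code-block:: llvm

  define i32 @clamp_positive(i32 %x) {
  entry:
    %cmp = icmp sgt i32 %x, 0
    ; First weight: true successor (%pos). Second: false successor (%nonpos).
    br i1 %cmp, label %pos, label %nonpos, !prof !0

  pos:
    ret i32 %x

  nonpos:
    ret i32 0
  }

  !0 = !{!"branch_weights", i32 2000, i32 1}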

``SwitchInst``
^^^^^^^^^^^^^^

Branch weights are assigned to every case, including the ``default`` case,
which is always case #0.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <DEFAULT_BRANCH_WEIGHT>
    [ , i32 <CASE_BRANCH_WEIGHT> ... ]
  }
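
A minimal sketch with hypothetical names and weights: the first weight belongs
to the ``default`` destination, and the remaining weights follow the order of
the case list, so a switch with three cases needs four weights.

.. code-block:: llvm

  define i32 @classify(i32 %x) {
  entry:
    switch i32 %x, label %other [
      i32 0, label %zero
      i32 3, label %three
      i32 5, label %five
    ], !prof !0

  other:
    ret i32 -1

  zero:
    ret i32 0

  three:
    ret i32 3

  five:
    ret i32 5
  }

  ; Weights, in order: default, case 0, case 3, case 5.
  !0 = !{!"branch_weights", i32 1, i32 1, i32 1, i32 100}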

``IndirectBrInst``
^^^^^^^^^^^^^^^^^^

Branch weights are assigned to every destination.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <LABEL_BRANCH_WEIGHT>
    [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
  }
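
The following sketch attaches one weight per listed destination. The names and
weights are hypothetical, and the ``ptr`` spelling assumes an LLVM release with
opaque pointers; older releases would write ``i8*`` instead.

.. code-block:: llvm

  define i32 @dispatch(ptr %target) {
  entry:
    ; One weight per destination, in the order the labels are listed.
    indirectbr ptr %target, [label %fast, label %slow], !prof !0

  fast:
    ret i32 1

  slow:
    ret i32 0
  }

  !0 = !{!"branch_weights", i32 90, i32 10}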

``CallInst``
^^^^^^^^^^^^

Calls may have branch weight metadata containing the execution count of
the call. It is currently used in SamplePGO mode only, to augment the
block and entry counts, which may not be accurate with sampling.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <CALL_BRANCH_WEIGHT>
  }
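
For illustration, a call annotated with a single weight might look like the
sketch below. The function names and the count of 1500 are invented for this
example.

.. code-block:: llvm

  declare void @callee()

  define void @caller() {
  entry:
    ; The single weight is the execution count of this call site.
    call void @callee(), !prof !0
    ret void
  }

  !0 = !{!"branch_weights", i32 1500}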

``InvokeInst``
^^^^^^^^^^^^^^

An invoke instruction may have branch weight metadata with one or two weights.
The second weight is optional and corresponds to the unwind branch.
If only one weight is set, it contains the execution count of the call
and is used in SamplePGO mode only, as described for the call instruction. If
both weights are specified, the second weight contains the number of times the
unwind branch was taken, and the first weight contains the execution count of
the call minus that number. When both weights are specified, they are used to
calculate ``BranchProbability`` as for ``BranchInst``, and SamplePGO uses the
sum of both weights.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <INVOKE_NORMAL_WEIGHT>
    [ , i32 <INVOKE_UNWIND_WEIGHT> ]
  }
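
In the hypothetical example below, the call executed 1000 times in total: it
returned normally 990 times and unwound 10 times. The function names and the
counts are assumptions made for this sketch, and the ``ptr`` spelling assumes
an LLVM release with opaque pointers.

.. code-block:: llvm

  declare void @may_throw()
  declare i32 @__gxx_personality_v0(...)

  define void @caller() personality ptr @__gxx_personality_v0 {
  entry:
    ; First weight: normal destination. Second weight: unwind destination.
    invoke void @may_throw()
            to label %normal unwind label %lpad, !prof !0

  normal:
    ret void

  lpad:
    %lp = landingpad { ptr, i32 }
            cleanup
    resume { ptr, i32 } %lp
  }

  !0 = !{!"branch_weights", i32 990, i32 10}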

Other
^^^^^

Other terminator instructions are not allowed to contain Branch Weight Metadata.

.. _\__builtin_expect:

Built-in ``expect`` Instructions
================================

The ``__builtin_expect(long exp, long c)`` instruction provides branch
prediction information. The return value is the value of ``exp``.

It is especially useful in conditional statements. Currently Clang supports two
conditional statements:

``if`` statement
^^^^^^^^^^^^^^^^

The ``exp`` parameter is the condition. The ``c`` parameter is the expected
comparison value. If it is equal to 1 (true), the condition is likely to be
true; otherwise, the condition is likely to be false. For example:

.. code-block:: c++

  if (__builtin_expect(x > 0, 1)) {
    // This block is likely to be taken.
  }
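
In LLVM IR, the built-in is usually represented by a call to the
``llvm.expect`` intrinsic, which the ``LowerExpectIntrinsic`` pass later
replaces with "branch_weights" metadata on the branch. The sketch below shows
roughly what that intermediate form could look like for the ``if`` example;
the function and value names are hypothetical, and the exact instructions
Clang emits depend on its version and the optimization level.

.. code-block:: llvm

  declare i64 @llvm.expect.i64(i64, i64)

  define i32 @f(i32 %x) {
  entry:
    %cmp = icmp sgt i32 %x, 0
    %conv = zext i1 %cmp to i64
    ; Second argument is the expected value of the first (1 means "true").
    %expval = call i64 @llvm.expect.i64(i64 %conv, i64 1)
    %tobool = icmp ne i64 %expval, 0
    br i1 %tobool, label %likely, label %other

  likely:
    ret i32 1

  other:
    ret i32 0
  }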

``switch`` statement
^^^^^^^^^^^^^^^^^^^^

The ``exp`` parameter is the value being switched on. The ``c`` parameter is
the expected value. If the expected value does not appear in the case list, the
``default`` case is assumed to be the likely one.

.. code-block:: c++

  switch (__builtin_expect(x, 5)) {
  default: break;
  case 0:  break;  // ...
  case 3:  break;  // ...
  case 5:  break;  // This case is likely to be taken.
  }

.. _\__builtin_expect_with_probability:

Built-in ``expect.with.probability`` Instruction
================================================

``__builtin_expect_with_probability(long exp, long c, double probability)`` has
the same semantics as ``__builtin_expect``, but the caller provides the
probability that ``exp == c``. The last argument, ``probability``, must be a
constant floating-point expression in the range [0.0, 1.0] inclusive.
Usage is similar to ``__builtin_expect``, for example:

``if`` statement
^^^^^^^^^^^^^^^^

If the expected comparison value ``c`` is equal to 1 (true) and
``probability`` is set to 0.8, the condition is true with probability 80% and
false with probability 20%.

.. code-block:: c++

  if (__builtin_expect_with_probability(x > 0, 1, 0.8)) {
    // This block is likely to be taken with probability 80%.
  }
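
At the IR level, the corresponding intrinsic is
``llvm.expect.with.probability``, which takes the probability as a third,
constant argument. A rough sketch of the ``if`` example, with hypothetical
function and value names:

.. code-block:: llvm

  declare i64 @llvm.expect.with.probability.i64(i64, i64, double)

  define i32 @g(i32 %x) {
  entry:
    %cmp = icmp sgt i32 %x, 0
    %conv = zext i1 %cmp to i64
    ; Expect %conv == 1 with probability 0.8.
    %expval = call i64 @llvm.expect.with.probability.i64(i64 %conv, i64 1, double 8.000000e-01)
    %tobool = icmp ne i64 %expval, 0
    br i1 %tobool, label %likely, label %other

  likely:
    ret i32 1

  other:
    ret i32 0
  }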

``switch`` statement
^^^^^^^^^^^^^^^^^^^^

This works like the ``switch`` statement with ``__builtin_expect``.
The probability that ``exp`` is equal to the expected value is given by
the third argument, ``probability``. The remaining probability
(``1.0 - probability``) is divided evenly among the other cases. For example,
each of the three other cases below gets (1.0 - 0.7) / 3 = 10%:

.. code-block:: c++

  switch (__builtin_expect_with_probability(x, 5, 0.7)) {
  default: break;  // Take this case with probability 10%
  case 0:  break;  // Take this case with probability 10%
  case 3:  break;  // Take this case with probability 10%
  case 5:  break;  // This case is likely to be taken with probability 70%
  }

CFG Modifications
=================

Branch Weight Metadata is not robust against CFG changes. If a terminator's
operands are changed, the metadata should be updated to match. Otherwise,
misoptimizations may occur due to incorrect branch prediction information.
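
As a sketch of the kind of update that is needed: if a transformation inverts
a branch condition and swaps the two successors, the two weights have to be
swapped as well, so that each weight still describes the same destination
block. The functions and weights below are hypothetical.

.. code-block:: llvm

  define i32 @before(i1 %c) {
  entry:
    ; %hot is the true successor and carries weight 99.
    br i1 %c, label %hot, label %cold, !prof !0

  hot:
    ret i32 1

  cold:
    ret i32 0
  }

  define i32 @after(i1 %c) {
  entry:
    %not.c = xor i1 %c, true
    ; Successors are swapped, so the weights are swapped too:
    ; %hot is now the false successor and still carries weight 99.
    br i1 %not.c, label %cold, label %hot, !prof !1

  hot:
    ret i32 1

  cold:
    ret i32 0
  }

  !0 = !{!"branch_weights", i32 99, i32 1}
  !1 = !{!"branch_weights", i32 1, i32 99}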

Function Entry Counts
=====================

To allow comparing different functions during inter-procedural analysis and
optimization, ``MD_prof`` nodes can also be assigned to a function definition.
The first operand is a string indicating the name of the associated counter.

Currently, one counter is supported: "function_entry_count". The second operand
is a 64-bit counter that indicates the number of times that this function was
invoked (in the case of instrumentation-based profiles). In the case of
sampling-based profiles, this operand is an approximation of how many times
the function was invoked.

For example, in the code below, the instrumentation for function foo()
indicates that it was called 2,590 times at runtime.

.. code-block:: llvm

  define i32 @foo() !prof !1 {
    ret i32 0
  }
  !1 = !{!"function_entry_count", i64 2590}

If "function_entry_count" has more than 2 operands, the later operands are
the GUIDs of the functions that need to be imported by ThinLTO. This is only
set by sampling-based profiles. It is needed because the sampling-based profile
was collected on a binary that had already imported and inlined these functions,
and we need to ensure the IR matches in the ThinLTO backends for profile
annotation. The reason why we cannot annotate this on the callsite is that it
can only go down 1 level in the call chain. For the cases where
foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc(), we will need to go down 2 levels
in the call chain to import both bar_in_b_cc and baz_in_c_cc.
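
The sketch below shows what such a node could look like for the call chain
above. The entry count and the two trailing values are placeholders invented
for this example; they are not the real GUIDs of any function.

.. code-block:: llvm

  define void @foo_in_a_cc() !prof !0 {
    ret void
  }

  ; Operand 2 is the entry count. The remaining operands are made-up GUIDs
  ; standing in for bar_in_b_cc and baz_in_c_cc.
  !0 = !{!"function_entry_count", i64 2590, i64 1111111111111111111, i64 2222222222222222222}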