Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring.  Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources.  Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

It also provides various domain-specific producers.

The Mesa Perfetto support adds additional producers, which allow GPU
performance (frequency, utilization, performance counters, etc.) to be
visualized on the same timeline, making it easier to understand, tune, and
debug system-level performance:

- pps-producer: A system-wide daemon that can collect global performance
  counters.
- mesa: A per-process producer within Mesa that captures render-stage traces
  on the GPU timeline, track events, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     -
   * - Intel
     - ``gpu.counters.i915``
     -
   * - Panfrost
     - ``gpu.counters.panfrost``
     -

Run
---

To capture a trace with Perfetto, take the following steps:

1. Build Perfetto from the sources available at ``subprojects/perfetto``,
   following
   `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

2. Create a `trace config <https://perfetto.dev/#/trace-config.md>`__, which is
   a protobuf text-format file with extension ``.cfg``, or use one of the
   config files under the ``src/tool/pps/cfg`` directory. More examples of
   config files can be found in ``subprojects/perfetto/test/configs``.

3. Change directory to ``subprojects/perfetto`` and run a
   `convenience script <https://perfetto.dev/#/running.md>`__ to start the
   tracing service:

   .. code-block:: console

      cd subprojects/perfetto
      CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n

4. Start any other producers you need, e.g. ``pps-producer``.

5. Start ``perfetto`` in the tmux session created in step 3.

6. Once tracing has finished, you can detach from tmux with :kbd:`Ctrl+b`,
   :kbd:`d`, and the convenience script should automatically copy the trace
   files into ``$HOME/Downloads``.

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``$HOME/Downloads/trace.protobuf`` by clicking on **Open trace file**.

8. Alternatively, you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which, despite the name, can be used to view non-Android traces).

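As a rough sketch of step 2, a minimal trace config for a Freedreno system
could look like the following. The data-source names come from the table of
supported data-sources; the buffer size, sampling period, and duration are
illustrative assumptions rather than recommended values, and the files under
``src/tool/pps/cfg`` remain the reference.

.. code-block:: text

   # Single in-memory trace buffer (size is an assumption).
   buffers {
     size_kb: 16384
     fill_policy: RING_BUFFER
   }

   # Global counters collected by pps-producer.
   data_sources {
     config {
       name: "gpu.counters.msm"
       gpu_counter_config {
         # Illustrative sampling period; tune for your workload.
         counter_period_ns: 100000
       }
     }
   }

   # Render-stage events from the per-process mesa producer.
   data_sources {
     config {
       name: "gpu.renderstages.msm"
     }
   }

   # Stop tracing automatically after 10 seconds.
   duration_ms: 10000

On other drivers, the data-source names would be swapped for the ones listed
for that driver.
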
Driver Specifics
~~~~~~~~~~~~~~~~

Below is driver-specific information and instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/gpu-metrics-reference.html>`__
performance counters, so run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Alternatively, to allow access to system-wide data without root permissions,
run:

.. code-block:: console

   sudo sysctl dev.i915.perf_stream_paranoid=0

Granting the ``CAP_PERFMON`` capability to the binary should work too.

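To illustrate the ``CAP_PERFMON`` route, a minimal sketch follows. It assumes
a kernel and libcap recent enough to know ``CAP_PERFMON`` (the capability was
introduced in Linux 5.8) and the same build path as above; whether this alone
is sufficient on a given system is not guaranteed.

.. code-block:: console

   sudo setcap cap_perfmon+ep ./build/src/tool/pps/pps-producer
   getcap ./build/src/tool/pps/pps-producer
   ./build/src/tool/pps/pps-producer

File capabilities are attached to the binary itself, so they need to be
reapplied after each rebuild.
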
Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel versions `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two steps:

1. Enable the Panfrost unstable ioctls via the kernel module parameter:

   .. code-block:: console

      modprobe panfrost unstable_ioctls=1

   Alternatively, you can add ``panfrost.unstable_ioctls=1`` to your kernel
   command line, or run
   ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls``.

2. Run the producer:

   .. code-block:: console

      ./build/src/tool/pps/pps-producer

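A parameter set through ``modprobe`` or sysfs as in step 1 only lasts until
the module is reloaded or the system reboots. One common way to make it
persistent is a modprobe configuration file; the file name below is a
hypothetical choice:

.. code-block:: text

   # /etc/modprobe.d/panfrost.conf (hypothetical file name)
   options panfrost unstable_ioctls=1

The current value can be checked by reading
``/sys/module/panfrost/parameters/unstable_ioctls``.
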
Troubleshooting
---------------

Tmux
~~~~

If the convenience script ``tools/tmux`` keeps copying artifacts to your
``SSH_TARGET`` without starting the tmux session, make sure you have ``tmux``
installed on your system:

.. code-block:: console

   apt install tmux

Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, you have probably lost data
because the trace buffer filled up and wrapped.

To prevent this loss of data, you can tweak the trace config file in
three different ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

      buffers {
          size_kb: 2048,
          fill_policy: RING_BUFFER,
      }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

      write_into_file: true
      file_write_period_ms: 250

- Discard new traces when the buffer fills:

  .. code-block:: javascript

      buffers {
          size_kb: 2048,
          fill_policy: DISCARD,
      }

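The first two options can be combined. The following sketch, with
illustrative values, pairs a larger ring buffer with periodic flushing to the
output file; ``write_into_file`` and ``file_write_period_ms`` are top-level
fields of the trace config, placed alongside ``buffers``:

.. code-block:: text

   # Larger ring buffer (size is an assumption; adjust to available memory).
   buffers {
     size_kb: 16384
     fill_policy: RING_BUFFER
   }

   # Drain the buffer into the output file every 250 ms.
   write_into_file: true
   file_write_period_ms: 250

The ``data_sources`` entries of the config stay unchanged.
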