<!--
doc/src/sgml/ref/pgtesttiming.sgml
PostgreSQL documentation
-->

<refentry id="pgtesttiming">
 <indexterm zone="pgtesttiming">
  <primary>pg_test_timing</primary>
 </indexterm>

 <refmeta>
  <refentrytitle><application>pg_test_timing</application></refentrytitle>
  <manvolnum>1</manvolnum>
  <refmiscinfo>Application</refmiscinfo>
 </refmeta>

 <refnamediv>
  <refname>pg_test_timing</refname>
  <refpurpose>measure timing overhead</refpurpose>
 </refnamediv>

 <refsynopsisdiv>
  <cmdsynopsis>
   <command>pg_test_timing</command>
   <arg rep="repeat"><replaceable>option</replaceable></arg>
  </cmdsynopsis>
 </refsynopsisdiv>

 <refsect1>
  <title>Description</title>

 <para>
  <application>pg_test_timing</application> is a tool to measure the timing overhead
  on your system and confirm that the system time never moves backwards.
  Systems that are slow to collect timing data can give less accurate
  <command>EXPLAIN ANALYZE</command> results.
 </para>
 </refsect1>

 <refsect1>
  <title>Options</title>

   <para>
    <application>pg_test_timing</application> accepts the following
    command-line options:

    <variablelist>

     <varlistentry>
      <term><option>-d <replaceable class="parameter">duration</replaceable></option></term>
      <term><option>--duration=<replaceable class="parameter">duration</replaceable></option></term>
      <listitem>
       <para>
        Specifies the test duration, in seconds. Longer durations
        give slightly better accuracy, and are more likely to discover
        problems with the system clock moving backwards. The default
        test duration is 3 seconds; see the example below for a longer run.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>-V</option></term>
      <term><option>--version</option></term>
      <listitem>
       <para>
        Print the <application>pg_test_timing</application> version and exit.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>-?</option></term>
      <term><option>--help</option></term>
      <listitem>
       <para>
        Show help about <application>pg_test_timing</application> command line
        arguments, and exit.
       </para>
      </listitem>
     </varlistentry>

    </variablelist>
   </para>
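
   <para>
    For example, to run the test for 10 seconds rather than the default 3
    (an illustrative invocation; the reported numbers will vary from system
    to system):
<screen>
pg_test_timing --duration=10
</screen>
   </para>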

 </refsect1>

 <refsect1>
  <title>Usage</title>

 <refsect2>
  <title>Interpreting results</title>

  <para>
   Good results will show most (>90%) individual timing calls take less than
   one microsecond. Average per loop overhead will be even lower, below 100
   nanoseconds. This example from an Intel i7-860 system using a TSC clock
   source shows excellent performance:

<screen>
Testing timing overhead for 3 seconds.
Per loop time including overhead: 35.96 ns
Histogram of timing durations:
  &lt; us   % of total      count
     1     96.40465   80435604
     2      3.59518    2999652
     4      0.00015        126
     8      0.00002         13
    16      0.00000          2
</screen>
  </para>

  <para>
   Note that the per loop time and the histogram are reported in different
   units.  The loop average can have resolution within a few nanoseconds (ns),
   while the individual timing calls can only resolve down to one microsecond
   (us).
  </para>
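
  <para>
   The per loop figure is essentially the test duration divided by the number
   of timing calls made.  In the run above, roughly 83.4 million calls were
   made in 3 seconds, which works out to about 36 ns per loop, matching the
   reported 35.96 ns.
  </para>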

 </refsect2>
 <refsect2>
  <title>Measuring executor timing overhead</title>

  <para>
   When the query executor runs a statement using
   <command>EXPLAIN ANALYZE</command>, individual operations are timed in
   addition to the summary being shown.  The timing overhead on your system
   can be checked by counting rows with the <application>psql</application>
   program:

<screen>
CREATE TABLE t AS SELECT * FROM generate_series(1,100000);
\timing
SELECT COUNT(*) FROM t;
EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
</screen>
  </para>

  <para>
   The i7-860 system measured above runs the count query in 9.8 ms while
   the <command>EXPLAIN ANALYZE</command> version takes 16.6 ms, each
   processing just over 100,000 rows.  That 6.8 ms difference means the timing
   overhead per row is 68 ns, about twice what
   <application>pg_test_timing</application> estimated it would be.  Even that
   relatively small amount of overhead makes the fully timed count statement
   take almost 70% longer.  On more substantial queries, the timing overhead
   would be less problematic.
  </para>
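
  <para>
   Spelling out the arithmetic behind those figures: (16.6 ms - 9.8 ms) /
   100,000 rows is about 68 ns of added time per row, and 16.6 ms / 9.8 ms is
   roughly 1.7, hence the nearly 70% slowdown.
  </para>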

 </refsect2>

 <refsect2>
  <title>Changing time sources</title>
  <para>
   On some newer Linux systems, it's possible to change the clock source used
   to collect timing data at any time.  A second example shows the slowdown
   possible from switching to the slower acpi_pm time source, on the same
   system used for the fast results above:

<screen>
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
# echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
# pg_test_timing
Per loop time including overhead: 722.92 ns
Histogram of timing durations:
  &lt; us   % of total      count
     1     27.84870    1155682
     2     72.05956    2990371
     4      0.07810       3241
     8      0.01357        563
    16      0.00007          3
</screen>
  </para>

  <para>
   In this configuration, the sample <command>EXPLAIN ANALYZE</command> above
   takes 115.9 ms.  That's 1061 ns of timing overhead per row, again a small
   multiple of what's measured directly by this utility.  That much timing
   overhead means the actual query itself takes only a tiny fraction of the
   accounted-for time; most of it is being consumed in overhead instead.  In
   this configuration, any <command>EXPLAIN ANALYZE</command> totals involving
   many timed operations would be inflated significantly by timing overhead.
  </para>
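
  <para>
   As before, the figure follows from the measured times: (115.9 ms - 9.8 ms)
   / 100,000 rows is about 1061 ns per row, and the untimed 9.8 ms query
   accounts for less than 10% of the total runtime.
  </para>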

  <para>
   FreeBSD also allows changing the time source on the fly, and it logs
   information about the timer selected during boot:

<screen>
# dmesg | grep "Timecounter"
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
Timecounter "i8254" frequency 1193182 Hz quality 0
Timecounters tick every 10.000 msec
Timecounter "TSC" frequency 2531787134 Hz quality 800
# sysctl kern.timecounter.hardware=TSC
kern.timecounter.hardware: ACPI-fast -> TSC
</screen>
  </para>

  <para>
   Other systems may only allow setting the time source on boot.  On older
   Linux systems the "clock" kernel setting is the only way to make this sort
   of change.  And even on some more recent ones, the only option you'll see
   for a clock source is "jiffies".  Jiffies are the older Linux software clock
   implementation, which can have good resolution when it's backed by fast
   enough timing hardware, as in this example:

<screen>
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
jiffies
$ dmesg | grep time.c
time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
time.c: Detected 2400.153 MHz processor.
$ pg_test_timing
Testing timing overhead for 3 seconds.
Per timing duration including loop overhead: 97.75 ns
Histogram of timing durations:
  &lt; us   % of total      count
     1     90.23734   27694571
     2      9.75277    2993204
     4      0.00981       3010
     8      0.00007         22
    16      0.00000          1
    32      0.00000          1
</screen></para>

 </refsect2>

 <refsect2>
  <title>Clock hardware and timing accuracy</title>

  <para>
   Collecting accurate timing information is normally done on computers using
   hardware clocks with various levels of accuracy.  With some hardware, the
   operating system can pass the system clock time almost directly to
   programs.  A system clock can also be derived from a chip that simply
   provides timing interrupts, periodic ticks at some known time interval.  In
   either case, the operating system kernel provides a clock source that hides
   these details.  But the accuracy of that clock source, and how quickly it
   can return results, varies based on the underlying hardware.
  </para>

  <para>
   Inaccurate time keeping can result in system instability.  Test any change
   to the clock source very carefully.  Operating system defaults sometimes
   favor reliability over best accuracy.  And if you are using a virtual
   machine, look into the recommended time sources compatible with it.
   Virtual hardware faces additional difficulties when emulating timers, and
   vendors often suggest settings specific to each operating system.
  </para>

  <para>
   The Time Stamp Counter (TSC) clock source is the most accurate one available
   on current generation CPUs. It's the preferred way to track the system time
   when it's supported by the operating system and the TSC clock is
   reliable. There are several ways that TSC can fail to provide an accurate
   timing source, making it unreliable. Older systems can have a TSC clock that
   varies based on the CPU temperature, making it unusable for timing. Trying
   to use TSC on some older multicore CPUs can give a reported time that's
   inconsistent among multiple cores. This can result in the time going
   backwards, a problem this program checks for.  And even the newest systems
   can fail to provide accurate TSC timing with very aggressive power saving
   configurations.
  </para>

  <para>
   Newer operating systems may check for the known TSC problems and switch to a
   slower, more stable clock source when they are seen.  If your system
   supports TSC time but doesn't default to that, it may be disabled for a good
   reason.  And some operating systems may not detect all the possible problems
   correctly, or will allow using TSC even in situations where it's known to be
   inaccurate.
  </para>

  <para>
   The High Precision Event Timer (HPET) is the preferred timer on systems
   where it's available and TSC is not accurate.  The timer chip itself is
   programmable to allow up to 100 nanosecond resolution, but you may not see
   that much accuracy in your system clock.
  </para>

  <para>
   Advanced Configuration and Power Interface (ACPI) provides a Power
   Management (PM) Timer, which Linux refers to as the acpi_pm.  The clock
   derived from acpi_pm will at best provide 300 nanosecond resolution.
  </para>

  <para>
   Timers used on older PC hardware include the 8254 Programmable Interval
   Timer (PIT), the real-time clock (RTC), the Advanced Programmable Interrupt
   Controller (APIC) timer, and the Cyclone timer.  These timers aim for
   millisecond resolution.
  </para>
  </refsect2>
 </refsect1>

 <refsect1>
  <title>See Also</title>

  <simplelist type="inline">
   <member><xref linkend="sql-explain"></member>
  </simplelist>
 </refsect1>
</refentry>
