<!-- doc/src/sgml/planstats.sgml -->

<chapter id="planner-stats-details">
 <title>How the Planner Uses Statistics</title>

  <para>
   This chapter builds on the material covered in <xref
   linkend="using-explain"/> and <xref linkend="planner-stats"/> to show some
   additional details about how the planner uses the
   system statistics to estimate the number of rows each part of a query might
   return. This is a significant part of the planning process,
   providing much of the raw material for cost calculation.
  </para>

  <para>
   The intent of this chapter is not to document the code in detail,
   but to present an overview of how it works.
   This will perhaps ease the learning curve for someone who subsequently
   wishes to read the code.
  </para>

 <sect1 id="row-estimation-examples">
  <title>Row Estimation Examples</title>

  <indexterm zone="row-estimation-examples">
   <primary>row estimation</primary>
   <secondary>planner</secondary>
  </indexterm>

  <para>
   The examples shown below use tables in the <productname>PostgreSQL</productname>
   regression test database.
   The outputs shown are taken from version 8.3.
   The behavior of earlier (or later) versions might vary.
   Note also that since <command>ANALYZE</command> uses random sampling
   while producing statistics, the results will change slightly after
   any new <command>ANALYZE</command>.
  </para>

  <para>
   Let's start with a very simple query:

<programlisting>
EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244)
</programlisting>

   How the planner determines the cardinality of <structname>tenk1</structname>
   is covered in <xref linkend="planner-stats"/>, but is repeated here for
   completeness. The number of pages and rows is looked up in
   <structname>pg_class</structname>:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';

 relpages | reltuples
----------+-----------
      358 |     10000
</programlisting>

    These numbers are current as of the last <command>VACUUM</command> or
    <command>ANALYZE</command> on the table.  The planner then fetches the
    actual current number of pages in the table (this is a cheap operation,
    not requiring a table scan).  If that is different from
    <structfield>relpages</structfield> then
    <structfield>reltuples</structfield> is scaled accordingly to
    arrive at a current number-of-rows estimate.  In the example above, the value of
    <structfield>relpages</structfield> is up-to-date so the rows estimate is
    the same as <structfield>reltuples</structfield>.
  </para>
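
  <para>
   The scaling step can be sketched in a few lines.  The following is an
   illustrative Python rendering of the arithmetic just described, not the
   planner's actual C code:
  </para>

```python
# Illustrative sketch (not the planner's actual code): scale reltuples by
# the current page count, assuming the row density per page observed at the
# last VACUUM/ANALYZE is unchanged.
def estimate_rows(reltuples, relpages, current_pages):
    if relpages == 0:
        return float(reltuples)
    rows_per_page = reltuples / relpages   # density at last VACUUM/ANALYZE
    return rows_per_page * current_pages

# relpages is up to date in the example, so the estimate equals reltuples:
rows = estimate_rows(reltuples=10000, relpages=358, current_pages=358)
```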

  <para>
   Let's move on to an example with a range condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;

                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------
 Bitmap Heap Scan on tenk1  (cost=24.06..394.64 rows=1007 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for the operator
   <literal>&lt;</literal> in <structname>pg_operator</structname>.
   This is held in the column <structfield>oprrest</structfield>,
   and the entry in this case is <function>scalarltsel</function>.
   The <function>scalarltsel</function> function retrieves the histogram for
   <structfield>unique1</structfield> from
   <structname>pg_statistic</structname>.  For manual queries it is more
   convenient to look in the simpler <structname>pg_stats</structname>
   view:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='unique1';

                   histogram_bounds
------------------------------------------------------
 {0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995}
</programlisting>

   Next the fraction of the histogram occupied by <quote>&lt; 1000</quote>
   is worked out. This is the selectivity. The histogram divides the range
   into equal frequency buckets, so all we have to do is locate the bucket
   that our value is in and count <emphasis>part</emphasis> of it and
   <emphasis>all</emphasis> of the ones before. The value 1000 is clearly in
   the second bucket (993&ndash;1997).  Assuming a linear distribution of
   values inside each bucket, we can calculate the selectivity as:

<programlisting>
selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets
            = (1 + (1000 - 993)/(1997 - 993))/10
            = 0.100697
</programlisting>

   that is, one whole bucket plus a linear fraction of the second, divided by
   the number of buckets. The estimated number of rows can now be calculated as
   the product of the selectivity and the cardinality of
   <structname>tenk1</structname>:

<programlisting>
rows = rel_cardinality * selectivity
     = 10000 * 0.100697
     = 1007  (rounding off)
</programlisting>
  </para>
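
  <para>
   The bucket arithmetic above generalizes readily.  The following is a
   simplified Python sketch of the interpolation, under the same
   linear-distribution assumption; it is not the actual
   <function>scalarltsel</function> implementation:
  </para>

```python
# Simplified sketch of histogram interpolation for "column < value",
# assuming a linear distribution of values within each equal-frequency bucket.
def hist_lt_selectivity(bounds, value):
    num_buckets = len(bounds) - 1
    for i in range(num_buckets):
        lo, hi = bounds[i], bounds[i + 1]
        if value <= hi:
            # i whole buckets lie below, plus a linear fraction of this one
            return (i + (value - lo) / (hi - lo)) / num_buckets
    return 1.0   # value is beyond the last histogram bound

bounds = [0, 993, 1997, 3050, 4040, 5036, 5957, 7057, 8029, 9016, 9995]
selectivity = hist_lt_selectivity(bounds, 1000)   # ~0.100697
rows = round(10000 * selectivity)                 # 1007
```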

  <para>
   Next let's consider an example with an equality condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=30 width=244)
   Filter: (stringu1 = 'CRAAAA'::name)
</programlisting>

   Again the planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for <literal>=</literal>, which is
   <function>eqsel</function>.  For equality estimation the histogram is
   not useful; instead the list of <firstterm>most
   common values</firstterm> (<acronym>MCV</acronym>s) is used to determine the
   selectivity. Let's have a look at the MCVs, with some additional columns
   that will be useful later:

<programlisting>
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

null_frac         | 0
n_distinct        | 676
most_common_vals  | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,&zwsp;JOAAAA,MCAAAA,NAAAAA,WGAAAA}
most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,&zwsp;0.003,0.003,0.003,0.003}
</programlisting>

   Since <literal>CRAAAA</literal> appears in the list of MCVs, the selectivity is
   merely the corresponding entry in the list of most common frequencies
   (<acronym>MCF</acronym>s):

<programlisting>
selectivity = mcf[3]
            = 0.003
</programlisting>

   As before, the estimated number of rows is just the product of this with the
   cardinality of <structname>tenk1</structname>:

<programlisting>
rows = 10000 * 0.003
     = 30
</programlisting>
  </para>
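
  <para>
   The lookup amounts to a simple positional match between the two arrays;
   sketched in Python (illustrative only, not the real
   <function>eqsel</function>):
  </para>

```python
# Illustrative sketch: equality selectivity when the constant is an MCV is
# just the frequency stored at the same position in the MCF list.
mcv_vals  = ['EJAAAA', 'BBAAAA', 'CRAAAA', 'FCAAAA', 'FEAAAA',
             'GSAAAA', 'JOAAAA', 'MCAAAA', 'NAAAAA', 'WGAAAA']
mcv_freqs = [0.00333333, 0.003, 0.003, 0.003, 0.003,
             0.003, 0.003, 0.003, 0.003, 0.003]

selectivity = mcv_freqs[mcv_vals.index('CRAAAA')]   # 0.003
rows = round(10000 * selectivity)                   # 30
```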

  <para>
   Now consider the same query, but with a constant that is not in the
   <acronym>MCV</acronym> list:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=15 width=244)
   Filter: (stringu1 = 'xxx'::name)
</programlisting>

   This is quite a different problem: how to estimate the selectivity when the
   value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list.
   The approach is to use the fact that the value is not in the list,
   combined with the knowledge of the frequencies for all of the
   <acronym>MCV</acronym>s:

<programlisting>
selectivity = (1 - sum(mcf))/(num_distinct - num_mcv)
            = (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 +
                    0.003 + 0.003 + 0.003 + 0.003))/(676 - 10)
            = 0.0014559
</programlisting>

   That is, add up all the frequencies for the <acronym>MCV</acronym>s and
   subtract them from one, then
   divide by the number of <emphasis>other</emphasis> distinct values.
   This amounts to assuming that the fraction of the column that is not any
   of the MCVs is evenly distributed among all the other distinct values.
   Notice that there are no null values so we don't have to worry about those
   (otherwise we'd subtract the null fraction from the numerator as well).
   The estimated number of rows is then calculated as usual:

<programlisting>
rows = 10000 * 0.0014559
     = 15  (rounding off)
</programlisting>
  </para>
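
  <para>
   The same calculation in Python form, under the even-distribution assumption
   described above (a sketch, not the planner's code):
  </para>

```python
# Sketch of the "not in the MCV list" estimate: the non-MCV (and non-null)
# fraction of the column is assumed to be spread evenly over the remaining
# distinct values.
mcv_freqs = [0.00333333, 0.003, 0.003, 0.003, 0.003,
             0.003, 0.003, 0.003, 0.003, 0.003]
n_distinct = 676
null_frac = 0.0

selectivity = (1 - sum(mcv_freqs) - null_frac) / (n_distinct - len(mcv_freqs))
rows = round(10000 * selectivity)    # 15
```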

  <para>
   The previous example with <literal>unique1 &lt; 1000</literal> was an
   oversimplification of what <function>scalarltsel</function> really does;
   now that we have seen an example of the use of MCVs, we can fill in some
   more detail.  The example was correct as far as it went, because since
   <structfield>unique1</structfield> is a unique column it has no MCVs (obviously, no
   value is any more common than any other value).  For a non-unique
   column, there will normally be both a histogram and an MCV list, and
   <emphasis>the histogram does not include the portion of the column
   population represented by the MCVs</emphasis>.  We do things this way because
   it allows more precise estimation.  In this situation
   <function>scalarltsel</function> directly applies the condition (e.g.,
   <quote>&lt; 1000</quote>) to each value of the MCV list, and adds up the
   frequencies of the MCVs for which the condition is true.  This gives
   an exact estimate of the selectivity within the portion of the table
   that is MCVs.  The histogram is then used in the same way as above
   to estimate the selectivity in the portion of the table that is not
   MCVs, and then the two numbers are combined to estimate the overall
   selectivity.  For example, consider

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 &lt; 'IAAAAA';

                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=3077 width=244)
   Filter: (stringu1 &lt; 'IAAAAA'::name)
</programlisting>

   We already saw the MCV information for <structfield>stringu1</structfield>,
   and here is its histogram:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

                                histogram_bounds
-------------------------------------------------------------------&zwsp;-------------
 {AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,&zwsp;XLAAAA,ZZAAAA}
</programlisting>

   Checking the MCV list, we find that the condition <literal>stringu1 &lt;
   'IAAAAA'</literal> is satisfied by the first six entries and not the last four,
   so the selectivity within the MCV part of the population is

<programlisting>
selectivity = sum(relevant mcfs)
            = 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003
            = 0.01833333
</programlisting>

   Summing all the MCFs also tells us that the total fraction of the
   population represented by MCVs is 0.03033333, and therefore the
   fraction represented by the histogram is 0.96966667 (again, there
   are no nulls, else we'd have to exclude them here).  We can see
   that the value <literal>IAAAAA</literal> falls nearly at the end of the
   third histogram bucket.  Using some rather cheesy assumptions
   about the frequency of different characters, the planner arrives
   at the estimate 0.298387 for the portion of the histogram population
   that is less than <literal>IAAAAA</literal>.  We then combine the estimates
   for the MCV and non-MCV populations:

<programlisting>
selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
            = 0.01833333 + 0.298387 * 0.96966667
            = 0.307669

rows        = 10000 * 0.307669
            = 3077  (rounding off)
</programlisting>

   In this particular example, the correction from the MCV list is fairly
   small, because the column distribution is actually quite flat (the
   statistics showing these particular values as being more common than
   others are mostly due to sampling error).  In a more typical case where
   some values are significantly more common than others, this complicated
   process gives a useful improvement in accuracy because the selectivity
   for the most common values is found exactly.
  </para>
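
  <para>
   The combination of the two population fractions can be sketched as follows;
   the planner's histogram estimate of 0.298387 is taken as a given here,
   since reproducing the per-character interpolation is beyond this example:
  </para>

```python
# Sketch: combine exact selectivity over the MCV portion of the population
# with the histogram-based estimate over the non-MCV portion.
mcv_freqs = [0.00333333, 0.003, 0.003, 0.003, 0.003,
             0.003, 0.003, 0.003, 0.003, 0.003]

mcv_selectivity = sum(mcv_freqs[:6])   # first six MCVs satisfy < 'IAAAAA'
mcv_fraction = sum(mcv_freqs)          # 0.03033333 of the table is MCVs
histogram_fraction = 1 - mcv_fraction  # 0.96966667 (no nulls to exclude)
histogram_selectivity = 0.298387       # the planner's histogram estimate

selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
rows = round(10000 * selectivity)      # 3077
```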

  <para>
   Now let's consider a case with more than one
   condition in the <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000 AND stringu1 = 'xxx';

                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------
 Bitmap Heap Scan on tenk1  (cost=23.80..396.91 rows=1 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   Filter: (stringu1 = 'xxx'::name)
   -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner assumes that the two conditions are independent, so that
   the individual selectivities of the clauses can be multiplied together:

<programlisting>
selectivity = selectivity(unique1 &lt; 1000) * selectivity(stringu1 = 'xxx')
            = 0.100697 * 0.0014559
            = 0.0001466

rows        = 10000 * 0.0001466
            = 1  (rounding off)
</programlisting>

   Notice that the number of rows estimated to be returned from the bitmap
   index scan reflects only the condition used with the index; this is
   important since it affects the cost estimate for the subsequent heap
   fetches.
  </para>
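
  <para>
   The independence assumption, in arithmetic form (a sketch using the
   selectivities derived in the earlier examples):
  </para>

```python
# Sketch: under the independence assumption, clause selectivities multiply.
sel_range = 0.100697    # selectivity of unique1 < 1000, from earlier
sel_eq    = 0.0014559   # selectivity of stringu1 = 'xxx', from earlier

selectivity = sel_range * sel_eq
rows = round(10000 * selectivity)    # 1
```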

  <para>
   Finally we will examine a query that involves a join:

<programlisting>
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;

                                      QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------
 Nested Loop  (cost=4.64..456.23 rows=50 width=488)
   -&gt;  Bitmap Heap Scan on tenk1 t1  (cost=4.64..142.17 rows=50 width=244)
         Recheck Cond: (unique1 &lt; 50)
         -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.63 rows=50 width=0)
               Index Cond: (unique1 &lt; 50)
   -&gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.00..6.27 rows=1 width=244)
         Index Cond: (unique2 = t1.unique2)
</programlisting>

   The restriction on <structname>tenk1</structname>,
   <literal>unique1 &lt; 50</literal>,
   is evaluated before the nested-loop join.
   This is handled analogously to the previous range example.  This time the
   value 50 falls into the first bucket of the
   <structfield>unique1</structfield> histogram:

<programlisting>
selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets
            = (0 + (50 - 0)/(993 - 0))/10
            = 0.005035

rows        = 10000 * 0.005035
            = 50  (rounding off)
</programlisting>

   The restriction for the join is <literal>t2.unique2 = t1.unique2</literal>.
   The operator is just our familiar <literal>=</literal>; however, the
   selectivity function is
   obtained from the <structfield>oprjoin</structfield> column of
   <structname>pg_operator</structname>, and is <function>eqjoinsel</function>.
   <function>eqjoinsel</function> looks up the statistical information for both
   <structname>tenk2</structname> and <structname>tenk1</structname>:

<programlisting>
SELECT tablename, null_frac, n_distinct, most_common_vals FROM pg_stats
WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2';

tablename  | null_frac | n_distinct | most_common_vals
-----------+-----------+------------+------------------
 tenk1     |         0 |         -1 |
 tenk2     |         0 |         -1 |
</programlisting>

   In this case there is no <acronym>MCV</acronym> information for
   <structfield>unique2</structfield> because all the values appear to be
   unique, so we use an algorithm that relies only on the number of
   distinct values for both relations together with their null fractions:

<programlisting>
selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
            = (1 - 0) * (1 - 0) / max(10000, 10000)
            = 0.0001
</programlisting>

   That is, subtract the null fraction from one for each of the relations,
   and divide by the maximum of the numbers of distinct values.
   The number of rows
   that the join is likely to emit is calculated as the cardinality of the
   Cartesian product of the two inputs, multiplied by the
   selectivity:

<programlisting>
rows = (outer_cardinality * inner_cardinality) * selectivity
     = (50 * 10000) * 0.0001
     = 50
</programlisting>
  </para>
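
  <para>
   The <function>eqjoinsel</function> arithmetic for this no-MCV case can be
   sketched as follows (illustrative only; <literal>n_distinct = -1</literal>
   in <structname>pg_stats</structname> means all values are distinct, i.e.,
   10000 distinct values here):
  </para>

```python
# Sketch: join selectivity from null fractions and distinct-value counts
# alone, as used when neither join column has an MCV list.
null_frac1, null_frac2 = 0.0, 0.0
num_distinct1, num_distinct2 = 10000, 10000   # n_distinct = -1 -> all distinct

selectivity = (1 - null_frac1) * (1 - null_frac2) \
              * min(1 / num_distinct1, 1 / num_distinct2)

outer_cardinality, inner_cardinality = 50, 10000
rows = round(outer_cardinality * inner_cardinality * selectivity)   # 50
```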

  <para>
   Had there been MCV lists for the two columns,
   <function>eqjoinsel</function> would have used direct comparison of the MCV
   lists to determine the join selectivity within the part of the column
   populations represented by the MCVs.  The estimate for the remainder of the
   populations follows the same approach shown here.
  </para>

  <para>
   Notice that we showed <literal>inner_cardinality</literal> as 10000, that is,
   the unmodified size of <structname>tenk2</structname>.  It might appear from
   inspection of the <command>EXPLAIN</command> output that the estimate of
   join rows comes from 50 * 1, that is, the number of outer rows times
   the estimated number of rows obtained by each inner index scan on
   <structname>tenk2</structname>.  But this is not the case: the join relation size
   is estimated before any particular join plan has been considered.  If
   everything is working well then the two ways of estimating the join
   size will produce about the same answer, but due to round-off error and
   other factors they sometimes diverge significantly.
  </para>

  <para>
   For those interested in further details, estimation of the size of
   a table (before any <literal>WHERE</literal> clauses) is done in
   <filename>src/backend/optimizer/util/plancat.c</filename>. The generic
   logic for clause selectivities is in
   <filename>src/backend/optimizer/path/clausesel.c</filename>.  The
   operator-specific selectivity functions are mostly found
   in <filename>src/backend/utils/adt/selfuncs.c</filename>.
  </para>
 </sect1>

 <sect1 id="multivariate-statistics-examples">
  <title>Multivariate Statistics Examples</title>

  <indexterm>
   <primary>row estimation</primary>
   <secondary>multivariate</secondary>
  </indexterm>

  <sect2 id="functional-dependencies">
   <title>Functional Dependencies</title>

   <para>
    Multivariate correlation can be demonstrated with a very simple data set
    &mdash; a table with two columns, both containing the same values:

<programlisting>
CREATE TABLE t (a INT, b INT);
INSERT INTO t SELECT i % 100, i % 100 FROM generate_series(1, 10000) s(i);
ANALYZE t;
</programlisting>

    As explained in <xref linkend="planner-stats"/>, the planner can determine
    cardinality of <structname>t</structname> using the number of pages and
    rows obtained from <structname>pg_class</structname>:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 't';

 relpages | reltuples
----------+-----------
       45 |     10000
</programlisting>

    The data distribution is very simple; there are only 100 distinct values
    in each column, uniformly distributed.
   </para>

   <para>
    The following example shows the result of estimating a <literal>WHERE</literal>
    condition on the <structfield>a</structfield> column:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..170.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: (a = 1)
   Rows Removed by Filter: 9900
</programlisting>

    The planner examines the condition and determines the selectivity
    of this clause to be 1%.  By comparing this estimate and the actual
    number of rows, we see that the estimate is very accurate
    (in fact exact, as the table is very small).  Changing the
    <literal>WHERE</literal> condition to use the <structfield>b</structfield> column, an
    identical plan is generated.  But observe what happens if we apply the same
    condition on both columns, combining them with <literal>AND</literal>:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;----------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>

    The planner estimates the selectivity for each condition individually,
    arriving at the same 1% estimates as above.  Then it assumes that the
    conditions are independent, and so it multiplies their selectivities,
    producing a final selectivity estimate of just 0.01%.
    This is a significant underestimate, as the actual number of rows
    matching the conditions (100) is two orders of magnitude higher.
   </para>
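
   <para>
    The collapse of the estimate to a single row is easy to verify by hand;
    here is the arithmetic as a small sketch:
   </para>

```python
# Sketch: each condition matches 1% of the table; assuming independence,
# the combined selectivity is 0.01 * 0.01 = 0.01% of 10000 rows.
sel_a = 100 / 10000    # selectivity of a = 1
sel_b = 100 / 10000    # selectivity of b = 1

estimated = round(10000 * sel_a * sel_b)   # 1
actual = 100   # the columns are perfectly correlated, so all a=1 rows match
```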

   <para>
    This problem can be fixed by creating a statistics object that
    directs <command>ANALYZE</command> to calculate functional-dependency
    multivariate statistics on the two columns:

<programlisting>
CREATE STATISTICS stts (dependencies) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                  QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>
   </para>
  </sect2>

  <sect2 id="multivariate-ndistinct-counts">
   <title>Multivariate N-Distinct Counts</title>

   <para>
    A similar problem occurs with estimation of the cardinality of sets of
    multiple columns, such as the number of groups that would be generated by
    a <command>GROUP BY</command> clause.  When <command>GROUP BY</command>
    lists a single column, the n-distinct estimate (which is visible as the
    estimated number of rows returned by the HashAggregate node) is very
    accurate:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;----------------------
 HashAggregate  (cost=195.00..196.00 rows=100 width=12) (actual rows=100 loops=1)
   Group Key: a
   -&gt;  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=4) (actual rows=10000 loops=1)
</programlisting>
    But without multivariate statistics, the estimate for the number of
    groups in a query with two columns in <command>GROUP BY</command>, as
    in the following example, is off by an order of magnitude:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
 HashAggregate  (cost=220.00..230.00 rows=1000 width=16) (actual rows=100 loops=1)
   Group Key: a, b
   -&gt;  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
</programlisting>
    By redefining the statistics object to include n-distinct counts for the
    two columns, the estimate is much improved:
<programlisting>
DROP STATISTICS stts;
CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
 HashAggregate  (cost=220.00..221.00 rows=100 width=16) (actual rows=100 loops=1)
   Group Key: a, b
   -&gt;  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
</programlisting>
   </para>

  </sect2>

  <sect2 id="mcv-lists">
   <title>MCV Lists</title>

   <para>
    As explained in <xref linkend="functional-dependencies"/>, functional
    dependencies are a very cheap and efficient type of statistics, but their
    main limitation is their global nature (they only track dependencies at
    the column level, not between individual column values).
   </para>

   <para>
    This section introduces a multivariate variant of <acronym>MCV</acronym>
    (most-common values) lists, a straightforward extension of the per-column
    statistics described in <xref linkend="row-estimation-examples"/>.  These
    statistics address that limitation by storing individual values, but they
    are naturally more expensive, in terms of building the statistics during
    <command>ANALYZE</command>, storage, and planning time.
   </para>

   <para>
    Let's look at the query from <xref linkend="functional-dependencies"/>
    again, but this time with an <acronym>MCV</acronym> list created on the
    same set of columns (be sure to drop the functional-dependency statistics
    first, so that the planner uses the newly created statistics):

<programlisting>
DROP STATISTICS stts;
CREATE STATISTICS stts2 (mcv) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>

    The estimate is as accurate as with the functional dependencies, mostly
    thanks to the table being fairly small and having a simple distribution
    with a low number of distinct values. Before looking at the second query,
    which was not handled by functional dependencies particularly well,
    let's inspect the <acronym>MCV</acronym> list a bit.
   </para>

   <para>
    The <acronym>MCV</acronym> list can be inspected using the
    <function>pg_mcv_list_items</function> set-returning function:

<programlisting>
SELECT m.* FROM pg_statistic_ext JOIN pg_statistic_ext_data ON (oid = stxoid),
                pg_mcv_list_items(stxdmcv) m WHERE stxname = 'stts2';
 index |  values  | nulls | frequency | base_frequency
-------+----------+-------+-----------+----------------
     0 | {0, 0}   | {f,f} |      0.01 |         0.0001
     1 | {1, 1}   | {f,f} |      0.01 |         0.0001
   ...
    49 | {49, 49} | {f,f} |      0.01 |         0.0001
    50 | {50, 50} | {f,f} |      0.01 |         0.0001
   ...
    97 | {97, 97} | {f,f} |      0.01 |         0.0001
    98 | {98, 98} | {f,f} |      0.01 |         0.0001
    99 | {99, 99} | {f,f} |      0.01 |         0.0001
(100 rows)
</programlisting>

    This confirms there are 100 distinct combinations in the two columns, and
    all of them are about equally likely (1% frequency for each one).  The
    base frequency is the frequency computed from per-column statistics, as if
    there were no multi-column statistics. Had there been any null values in
    either of the columns, this would be identified in the
    <structfield>nulls</structfield> column.
   </para>
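
   <para>
    The base frequency is simply what the independence assumption would
    predict, i.e., the product of the per-column frequencies:
   </para>

```python
# Sketch: each per-column value occurs with 1% frequency, so the
# independence-based "base frequency" of a pair is 0.01 * 0.01 = 0.0001,
# while the actual multivariate frequency of each {i, i} pair is 0.01.
frequency = 0.01
base_frequency = 0.01 * 0.01
```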

   <para>
    When estimating the selectivity, the planner applies all the conditions
    to items in the <acronym>MCV</acronym> list, and then sums the frequencies
    of the matching ones. See <function>mcv_clauselist_selectivity</function>
    in <filename>src/backend/statistics/mcv.c</filename> for details.
   </para>
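
   <para>
    That matching process can be sketched as follows; this is a toy Python
    illustration of the idea, not the logic in
    <function>mcv_clauselist_selectivity</function> itself:
   </para>

```python
# Toy sketch: apply the query's clauses to every MCV item and sum the
# frequencies of the items that satisfy all of them.
mcv_items = [{'values': (i, i), 'frequency': 0.01} for i in range(100)]

def mcv_selectivity(items, predicate):
    return sum(item['frequency'] for item in items if predicate(item['values']))

# a = 1 AND b = 1 matches one item; a = 1 AND b = 10 matches none,
# and neither does the range query a <= 49 AND b > 49.
sel_match    = mcv_selectivity(mcv_items, lambda v: v[0] == 1 and v[1] == 1)
sel_mismatch = mcv_selectivity(mcv_items, lambda v: v[0] == 1 and v[1] == 10)
sel_range    = mcv_selectivity(mcv_items, lambda v: v[0] <= 49 and v[1] > 49)
```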

   <para>
    Compared to functional dependencies, <acronym>MCV</acronym> lists have two
    major advantages. First, the list stores actual values, making it possible
    to decide which combinations of values are compatible:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
   Filter: ((a = 1) AND (b = 10))
   Rows Removed by Filter: 10000
</programlisting>

    Second, <acronym>MCV</acronym> lists handle a wider range of clause types,
    not just equality clauses like functional dependencies. For example,
    consider the following range query for the same table:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a &lt;= 49 AND b &gt; 49;
                                QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
   Filter: ((a &lt;= 49) AND (b &gt; 49))
   Rows Removed by Filter: 10000
</programlisting>

   </para>

  </sect2>

 </sect1>

 <sect1 id="planner-stats-security">
  <title>Planner Statistics and Security</title>

  <para>
   Access to the table <structname>pg_statistic</structname> is restricted to
   superusers, so that ordinary users cannot learn about the contents of the
   tables of other users from it.  Some selectivity estimation functions will
   use a user-provided operator (either the operator appearing in the query or
   a related operator) to analyze the stored statistics.  For example, in order
   to determine whether a stored most common value is applicable, the
   selectivity estimator will have to run the appropriate <literal>=</literal>
   operator to compare the constant in the query to the stored value.
   Thus the data in <structname>pg_statistic</structname> is potentially
   passed to user-defined operators.  An appropriately crafted operator can
   intentionally leak the passed operands (for example, by logging them
   or writing them to a different table), or accidentally leak them by showing
   their values in error messages, in either case possibly exposing data from
   <structname>pg_statistic</structname> to a user who should not be able to
   see it.
  </para>

  <para>
   In order to prevent this, the following applies to all built-in selectivity
   estimation functions.  When planning a query, in order to be able to use
   stored statistics, the current user must either
   have <literal>SELECT</literal> privilege on the table or the involved
   columns, or the operator used must be <literal>LEAKPROOF</literal> (more
   accurately, the function that the operator is based on).  If not, then the
   selectivity estimator will behave as if no statistics are available, and
   the planner will proceed with default or fall-back assumptions.
  </para>

  <para>
   If a user does not have the required privilege on the table or columns,
   then in many cases the query will ultimately receive a permission-denied
   error, in which case this mechanism is invisible in practice.  But if the
   user is reading from a security-barrier view, then the planner might wish
   to check the statistics of an underlying table that is otherwise
   inaccessible to the user.  In that case, the operator should be leak-proof
   or the statistics will not be used.  There is no direct feedback about
   that, except that the plan might be suboptimal.  If one suspects that this
   is the case, one could try running the query as a more privileged user,
   to see if a different plan results.
  </para>

  <para>
   This restriction applies only to cases where the planner would need to
   execute a user-defined operator on one or more values
   from <structname>pg_statistic</structname>.  Thus the planner is permitted
   to use generic statistical information, such as the fraction of null values
   or the number of distinct values in a column, regardless of access
   privileges.
  </para>

  <para>
   Selectivity estimation functions contained in third-party extensions that
   potentially operate on statistics with user-defined operators should follow
   the same security rules.  Consult the PostgreSQL source code for guidance.
  </para>
 </sect1>
</chapter>