<!-- doc/src/sgml/planstats.sgml -->

<chapter id="planner-stats-details">
 <title>How the Planner Uses Statistics</title>

 <para>
  This chapter builds on the material covered in <xref
  linkend="using-explain"/> and <xref linkend="planner-stats"/> to show some
  additional details about how the planner uses the
  system statistics to estimate the number of rows each part of a query might
  return. This is a significant part of the planning process,
  providing much of the raw material for cost calculation.
 </para>

 <para>
  The intent of this chapter is not to document the code in detail,
  but to present an overview of how it works.
  This will perhaps ease the learning curve for someone who subsequently
  wishes to read the code.
 </para>

 <sect1 id="row-estimation-examples">
  <title>Row Estimation Examples</title>

  <indexterm zone="row-estimation-examples">
   <primary>row estimation</primary>
   <secondary>planner</secondary>
  </indexterm>

  <para>
   The examples shown below use tables in the <productname>PostgreSQL</productname>
   regression test database.
   The outputs shown are taken from version 8.3.
   The behavior of earlier (or later) versions might vary.
   Note also that since <command>ANALYZE</command> uses random sampling
   while producing statistics, the results will change slightly after
   any new <command>ANALYZE</command>.
  </para>

  <para>
   Let's start with a very simple query:

<programlisting>
EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244)
</programlisting>

   How the planner determines the cardinality of <structname>tenk1</structname>
   is covered in <xref linkend="planner-stats"/>, but is repeated here for
   completeness.
   The number of pages and rows is looked up in
   <structname>pg_class</structname>:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';

 relpages | reltuples
----------+-----------
      358 |     10000
</programlisting>

   These numbers are current as of the last <command>VACUUM</command> or
   <command>ANALYZE</command> on the table. The planner then fetches the
   actual current number of pages in the table (this is a cheap operation,
   not requiring a table scan). If that is different from
   <structfield>relpages</structfield> then
   <structfield>reltuples</structfield> is scaled accordingly to
   arrive at a current number-of-rows estimate. In the example above, the value of
   <structfield>relpages</structfield> is up-to-date so the rows estimate is
   the same as <structfield>reltuples</structfield>.
  </para>

  <para>
   Let's move on to an example with a range condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;

                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------
 Bitmap Heap Scan on tenk1  (cost=24.06..394.64 rows=1007 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for the operator
   <literal>&lt;</literal> in <structname>pg_operator</structname>.
   This is held in the column <structfield>oprrest</structfield>,
   and the entry in this case is <function>scalarltsel</function>.
   The <function>scalarltsel</function> function retrieves the histogram for
   <structfield>unique1</structfield> from
   <structname>pg_statistic</structname>.
   For manual queries it is more
   convenient to look in the simpler <structname>pg_stats</structname>
   view:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='unique1';

                   histogram_bounds
------------------------------------------------------
 {0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995}
</programlisting>

   Next the fraction of the histogram occupied by <quote>&lt; 1000</quote>
   is worked out. This is the selectivity. The histogram divides the range
   into equal frequency buckets, so all we have to do is locate the bucket
   that our value is in and count <emphasis>part</emphasis> of it and
   <emphasis>all</emphasis> of the ones before. The value 1000 is clearly in
   the second bucket (993–1997). Assuming a linear distribution of
   values inside each bucket, we can calculate the selectivity as:

<programlisting>
selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets
            = (1 + (1000 - 993)/(1997 - 993))/10
            = 0.100697
</programlisting>

   that is, one whole bucket plus a linear fraction of the second, divided by
   the number of buckets.
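The computation just described can be written out as a short sketch (Python here purely for illustration; the planner's actual implementation is the C function <function>scalarltsel</function> and its helpers):

```python
import bisect

def histogram_selectivity(bounds, value):
    """Fraction of rows with column < value, from equal-frequency
    histogram bounds, assuming a linear distribution inside each bucket."""
    num_buckets = len(bounds) - 1
    if value <= bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    i = bisect.bisect_right(bounds, value) - 1   # bucket containing value
    fraction = (value - bounds[i]) / (bounds[i + 1] - bounds[i])
    return (i + fraction) / num_buckets

bounds = [0, 993, 1997, 3050, 4040, 5036, 5957, 7057, 8029, 9016, 9995]
print(round(histogram_selectivity(bounds, 1000), 6))   # 0.100697
```
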
   The estimated number of rows can now be calculated as
   the product of the selectivity and the cardinality of
   <structname>tenk1</structname>:

<programlisting>
rows = rel_cardinality * selectivity
     = 10000 * 0.100697
     = 1007  (rounding off)
</programlisting>
  </para>

  <para>
   Next let's consider an example with an equality condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=30 width=244)
   Filter: (stringu1 = 'CRAAAA'::name)
</programlisting>

   Again the planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for <literal>=</literal>, which is
   <function>eqsel</function>. For equality estimation the histogram is
   not useful; instead the list of <firstterm>most
   common values</firstterm> (<acronym>MCV</acronym>s) is used to determine the
   selectivity.
   Let's have a look at the MCVs, with some additional columns
   that will be useful later:

<programlisting>
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

null_frac         | 0
n_distinct        | 676
most_common_vals  | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,&zwsp;JOAAAA,MCAAAA,NAAAAA,WGAAAA}
most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,&zwsp;0.003,0.003,0.003,0.003}
</programlisting>

   Since <literal>CRAAAA</literal> appears in the list of MCVs, the selectivity is
   merely the corresponding entry in the list of most common frequencies
   (<acronym>MCF</acronym>s):

<programlisting>
selectivity = mcf[3]
            = 0.003
</programlisting>

   As before, the estimated number of rows is just the product of this with the
   cardinality of <structname>tenk1</structname>:

<programlisting>
rows = 10000 * 0.003
     = 30
</programlisting>
  </para>

  <para>
   Now consider the same query, but with a constant that is not in the
   <acronym>MCV</acronym> list:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=15 width=244)
   Filter: (stringu1 = 'xxx'::name)
</programlisting>

   This is quite a different problem: how to estimate the selectivity when the
   value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list.
   The approach is to use the fact that the value is not in the list,
   combined with the knowledge of the frequencies for all of the
   <acronym>MCV</acronym>s:

<programlisting>
selectivity = (1 - sum(mvf))/(num_distinct - num_mcv)
            = (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 +
                    0.003 + 0.003 + 0.003 + 0.003))/(676 - 10)
            = 0.0014559
</programlisting>

   That is, add up all the frequencies for the <acronym>MCV</acronym>s and
   subtract them from one, then
   divide by the number of <emphasis>other</emphasis> distinct values.
   This amounts to assuming that the fraction of the column that is not any
   of the MCVs is evenly distributed among all the other distinct values.
   Notice that there are no null values so we don't have to worry about those
   (otherwise we'd subtract the null fraction from the numerator as well).
   The estimated number of rows is then calculated as usual:

<programlisting>
rows = 10000 * 0.0014559
     = 15  (rounding off)
</programlisting>
  </para>

  <para>
   The previous example with <literal>unique1 &lt; 1000</literal> was an
   oversimplification of what <function>scalarltsel</function> really does;
   now that we have seen an example of the use of MCVs, we can fill in some
   more detail. The example was correct as far as it went, because since
   <structfield>unique1</structfield> is a unique column it has no MCVs (obviously, no
   value is any more common than any other value). For a non-unique
   column, there will normally be both a histogram and an MCV list, and
   <emphasis>the histogram does not include the portion of the column
   population represented by the MCVs</emphasis>. We do things this way because
   it allows more precise estimation.
   In this situation
   <function>scalarltsel</function> directly applies the condition (e.g.,
   <quote>&lt; 1000</quote>) to each value of the MCV list, and adds up the
   frequencies of the MCVs for which the condition is true. This gives
   an exact estimate of the selectivity within the portion of the table
   that is MCVs. The histogram is then used in the same way as above
   to estimate the selectivity in the portion of the table that is not
   MCVs, and then the two numbers are combined to estimate the overall
   selectivity. For example, consider

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 &lt; 'IAAAAA';

                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=3077 width=244)
   Filter: (stringu1 &lt; 'IAAAAA'::name)
</programlisting>

   We already saw the MCV information for <structfield>stringu1</structfield>,
   and here is its histogram:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

                                histogram_bounds
-------------------------------------------------------------------&zwsp;-------------
 {AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,&zwsp;XLAAAA,ZZAAAA}
</programlisting>

   Checking the MCV list, we find that the condition <literal>stringu1 &lt;
   'IAAAAA'</literal> is satisfied by the first six entries and not the last four,
   so the selectivity within the MCV part of the population is

<programlisting>
selectivity = sum(relevant mvfs)
            = 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003
            = 0.01833333
</programlisting>

   Summing all the MCFs also tells us that the total fraction of the
   population represented by MCVs is 0.03033333, and therefore the
   fraction represented by the histogram is 0.96966667 (again, there
   are no nulls, else we'd have to exclude them here).
   We can see
   that the value <literal>IAAAAA</literal> falls nearly at the end of the
   third histogram bucket. Using some rather cheesy assumptions
   about the frequency of different characters, the planner arrives
   at the estimate 0.298387 for the portion of the histogram population
   that is less than <literal>IAAAAA</literal>. We then combine the estimates
   for the MCV and non-MCV populations:

<programlisting>
selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
            = 0.01833333 + 0.298387 * 0.96966667
            = 0.307669

rows        = 10000 * 0.307669
            = 3077  (rounding off)
</programlisting>

   In this particular example, the correction from the MCV list is fairly
   small, because the column distribution is actually quite flat (the
   statistics showing these particular values as being more common than
   others are mostly due to sampling error). In a more typical case where
   some values are significantly more common than others, this complicated
   process gives a useful improvement in accuracy because the selectivity
   for the most common values is found exactly.
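The combination step can be sketched as follows (an illustrative Python fragment; the frequencies and the 0.298387 histogram estimate are the values derived above, not computed from scratch here):

```python
mcv_freqs = [0.00333333] + [0.003] * 9   # from most_common_freqs earlier

# MCVs satisfying stringu1 < 'IAAAAA': the first six entries
mcv_selectivity = sum(mcv_freqs[:6])

# fraction of the table not covered by the MCV list (no nulls here)
histogram_fraction = 1.0 - sum(mcv_freqs)

histogram_selectivity = 0.298387   # planner's estimate for the non-MCV part

selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
print(round(selectivity, 6))        # 0.307669
print(round(10000 * selectivity))   # 3077
```
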
  </para>

  <para>
   Now let's consider a case with more than one
   condition in the <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000 AND stringu1 = 'xxx';

                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------
 Bitmap Heap Scan on tenk1  (cost=23.80..396.91 rows=1 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   Filter: (stringu1 = 'xxx'::name)
   ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner assumes that the two conditions are independent, so that
   the individual selectivities of the clauses can be multiplied together:

<programlisting>
selectivity = selectivity(unique1 &lt; 1000) * selectivity(stringu1 = 'xxx')
            = 0.100697 * 0.0014559
            = 0.0001466

rows        = 10000 * 0.0001466
            = 1  (rounding off)
</programlisting>

   Notice that the number of rows estimated to be returned from the bitmap
   index scan reflects only the condition used with the index; this is
   important since it affects the cost estimate for the subsequent heap
   fetches.
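Numerically, the combined estimate works out like this (a sketch using the selectivities derived in the earlier examples, not the planner's actual code):

```python
sel_range = 0.100697    # unique1 < 1000, from the histogram example
sel_eq    = 0.0014559   # stringu1 = 'xxx', from the MCV example

# independence assumption: multiply the per-clause selectivities
selectivity = sel_range * sel_eq
print(round(selectivity, 7))        # 0.0001466
print(round(10000 * selectivity))   # 1
```
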
  </para>

  <para>
   Finally we will examine a query that involves a join:

<programlisting>
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;

                                      QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------
 Nested Loop  (cost=4.64..456.23 rows=50 width=488)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.64..142.17 rows=50 width=244)
         Recheck Cond: (unique1 &lt; 50)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.63 rows=50 width=0)
               Index Cond: (unique1 &lt; 50)
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.00..6.27 rows=1 width=244)
         Index Cond: (unique2 = t1.unique2)
</programlisting>

   The restriction on <structname>tenk1</structname>,
   <literal>unique1 &lt; 50</literal>,
   is evaluated before the nested-loop join.
   This is handled analogously to the previous range example. This time the
   value 50 falls into the first bucket of the
   <structfield>unique1</structfield> histogram:

<programlisting>
selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets
            = (0 + (50 - 0)/(993 - 0))/10
            = 0.005035

rows        = 10000 * 0.005035
            = 50  (rounding off)
</programlisting>

   The restriction for the join is <literal>t2.unique2 = t1.unique2</literal>.
   The operator is just
   our familiar <literal>=</literal>, however the selectivity function is
   obtained from the <structfield>oprjoin</structfield> column of
   <structname>pg_operator</structname>, and is <function>eqjoinsel</function>.
   <function>eqjoinsel</function> looks up the statistical information for both
   <structname>tenk2</structname> and <structname>tenk1</structname>:

<programlisting>
SELECT tablename, null_frac, n_distinct, most_common_vals FROM pg_stats
WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2';

 tablename | null_frac | n_distinct | most_common_vals
-----------+-----------+------------+------------------
 tenk1     |         0 |         -1 |
 tenk2     |         0 |         -1 |
</programlisting>

   In this case there is no <acronym>MCV</acronym> information for
   <structfield>unique2</structfield> because all the values appear to be
   unique, so we use an algorithm that relies only on the number of
   distinct values for both relations together with their null fractions:

<programlisting>
selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
            = (1 - 0) * (1 - 0) / max(10000, 10000)
            = 0.0001
</programlisting>

   That is, subtract the null fraction from one for each of the relations,
   and divide by the maximum of the numbers of distinct values.
   The number of rows
   that the join is likely to emit is calculated as the cardinality of the
   Cartesian product of the two inputs, multiplied by the
   selectivity:

<programlisting>
rows = (outer_cardinality * inner_cardinality) * selectivity
     = (50 * 10000) * 0.0001
     = 50
</programlisting>
  </para>

  <para>
   Had there been MCV lists for the two columns,
   <function>eqjoinsel</function> would have used direct comparison of the MCV
   lists to determine the join selectivity within the part of the column
   populations represented by the MCVs. The estimate for the remainder of the
   populations follows the same approach shown here.
  </para>

  <para>
   Notice that we showed <literal>inner_cardinality</literal> as 10000, that is,
   the unmodified size of <structname>tenk2</structname>.
   It might appear from
   inspection of the <command>EXPLAIN</command> output that the estimate of
   join rows comes from 50 * 1, that is, the number of outer rows times
   the estimated number of rows obtained by each inner index scan on
   <structname>tenk2</structname>. But this is not the case: the join relation size
   is estimated before any particular join plan has been considered. If
   everything is working well then the two ways of estimating the join
   size will produce about the same answer, but due to round-off error and
   other factors they sometimes diverge significantly.
  </para>

  <para>
   For those interested in further details, estimation of the size of
   a table (before any <literal>WHERE</literal> clauses) is done in
   <filename>src/backend/optimizer/util/plancat.c</filename>. The generic
   logic for clause selectivities is in
   <filename>src/backend/optimizer/path/clausesel.c</filename>. The
   operator-specific selectivity functions are mostly found
   in <filename>src/backend/utils/adt/selfuncs.c</filename>.
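As a cross-check, the join estimate above can be reproduced with a few lines (illustrative Python only; <function>eqjoinsel</function> itself handles many more cases, such as MCV lists and null values):

```python
def eqjoin_selectivity(null_frac1, n_distinct1, null_frac2, n_distinct2):
    """Equality-join selectivity with no MCV lists: multiply the non-null
    fractions and divide by the larger number of distinct values."""
    return (1.0 - null_frac1) * (1.0 - null_frac2) / max(n_distinct1, n_distinct2)

# n_distinct = -1 in pg_stats means "all values distinct", i.e., 10000 rows here
sel = eqjoin_selectivity(0.0, 10000, 0.0, 10000)
print(sel)                                    # 0.0001
outer_rows, inner_rows = 50, 10000
print(round(outer_rows * inner_rows * sel))   # 50
```
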
  </para>
 </sect1>

 <sect1 id="multivariate-statistics-examples">
  <title>Multivariate Statistics Examples</title>

  <indexterm>
   <primary>row estimation</primary>
   <secondary>multivariate</secondary>
  </indexterm>

  <sect2 id="functional-dependencies">
   <title>Functional Dependencies</title>

   <para>
    Multivariate correlation can be demonstrated with a very simple data set
    — a table with two columns, both containing the same values:

<programlisting>
CREATE TABLE t (a INT, b INT);
INSERT INTO t SELECT i % 100, i % 100 FROM generate_series(1, 10000) s(i);
ANALYZE t;
</programlisting>

    As explained in <xref linkend="planner-stats"/>, the planner can determine
    the cardinality of <structname>t</structname> using the number of pages and
    rows obtained from <structname>pg_class</structname>:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 't';

 relpages | reltuples
----------+-----------
       45 |     10000
</programlisting>

    The data distribution is very simple; there are only 100 distinct values
    in each column, uniformly distributed.
   </para>

   <para>
    The following example shows the result of estimating a <literal>WHERE</literal>
    condition on the <structfield>a</structfield> column:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..170.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: (a = 1)
   Rows Removed by Filter: 9900
</programlisting>

    The planner examines the condition and determines the selectivity
    of this clause to be 1%. By comparing this estimate and the actual
    number of rows, we see that the estimate is very accurate
    (in fact exact, as the table is very small).
    If the <literal>WHERE</literal> condition uses the
    <structfield>b</structfield> column instead, an
    identical plan is generated. But observe what happens if we apply the same
    condition on both columns, combining them with <literal>AND</literal>:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                QUERY PLAN
-------------------------------------------------------------------&zwsp;----------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>

    The planner estimates the selectivity for each condition individually,
    arriving at the same 1% estimates as above. Then it assumes that the
    conditions are independent, and so it multiplies their selectivities,
    producing a final selectivity estimate of just 0.01%.
    This is a significant underestimate, as the actual number of rows
    matching the conditions (100) is two orders of magnitude higher.
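The misestimate is easy to reproduce numerically (a sketch of the independence assumption, not the planner's actual code):

```python
total_rows = 10000
sel_a = 100 / total_rows   # a = 1 matches 100 rows -> 1%
sel_b = 100 / total_rows   # b = 1 likewise -> 1%

# independence assumption: multiply the per-clause selectivities
naive_rows = total_rows * sel_a * sel_b
print(round(naive_rows))    # 1

# but b always equals a, so the true selectivity is that of one clause alone
actual_rows = total_rows * sel_a
print(round(actual_rows))   # 100
```
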
   </para>

   <para>
    This problem can be fixed by creating a statistics object that
    directs <command>ANALYZE</command> to calculate functional-dependency
    multivariate statistics on the two columns:

<programlisting>
CREATE STATISTICS stts (dependencies) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>
   </para>
  </sect2>

  <sect2 id="multivariate-ndistinct-counts">
   <title>Multivariate N-Distinct Counts</title>

   <para>
    A similar problem occurs with estimation of the cardinality of sets of
    multiple columns, such as the number of groups that would be generated by
    a <command>GROUP BY</command> clause. When <command>GROUP BY</command>
    lists a single column, the n-distinct estimate (which is visible as the
    estimated number of rows returned by the HashAggregate node) is very
    accurate:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;----------------------
 HashAggregate  (cost=195.00..196.00 rows=100 width=12) (actual rows=100 loops=1)
   Group Key: a
   ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=4) (actual rows=10000 loops=1)
</programlisting>
    But without multivariate statistics, the estimate for the number of
    groups in a query with two columns in <command>GROUP BY</command>, as
    in the following example, is off by an order of magnitude:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
 HashAggregate  (cost=220.00..230.00 rows=1000 width=16) (actual rows=100 loops=1)
   Group Key: a, b
   ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
</programlisting>
    By redefining the statistics object to include n-distinct counts for the
    two columns, the estimate is much improved:
<programlisting>
DROP STATISTICS stts;
CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
 HashAggregate  (cost=220.00..221.00 rows=100 width=16) (actual rows=100 loops=1)
   Group Key: a, b
   ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
</programlisting>
   </para>

  </sect2>

  <sect2 id="mcv-lists">
   <title>MCV Lists</title>

   <para>
    As explained in <xref linkend="functional-dependencies"/>, functional
    dependencies are a very cheap and efficient type of statistics, but their
    main limitation is their global nature (they only track dependencies at
    the column level, not between individual column values).
   </para>

   <para>
    This section introduces a multivariate variant of <acronym>MCV</acronym>
    (most-common values) lists, a straightforward extension of the per-column
    statistics described in <xref linkend="row-estimation-examples"/>. These
    statistics address the limitation by storing individual values, but they
    are naturally more expensive, both in terms of building the statistics in
    <command>ANALYZE</command>, and in storage and planning time.
   </para>

   <para>
    Let's look at the query from <xref linkend="functional-dependencies"/>
    again, but this time with an <acronym>MCV</acronym> list created on the
    same set of columns (be sure to drop the functional dependency statistics
    first, so that the planner uses the newly created statistics):

<programlisting>
DROP STATISTICS stts;
CREATE STATISTICS stts2 (mcv) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>

    The estimate is as accurate as with the functional dependencies, mostly
    thanks to the table being fairly small and having a simple distribution
    with a low number of distinct values. Before looking at the second query,
    which was not handled by functional dependencies particularly well,
    let's inspect the <acronym>MCV</acronym> list a bit.
   </para>

   <para>
    Inspecting the <acronym>MCV</acronym> list is possible using the
    <function>pg_mcv_list_items</function> set-returning function:

<programlisting>
SELECT m.* FROM pg_statistic_ext JOIN pg_statistic_ext_data ON (oid = stxoid),
                pg_mcv_list_items(stxdmcv) m WHERE stxname = 'stts2';
 index |  values  | nulls | frequency | base_frequency
-------+----------+-------+-----------+----------------
     0 | {0, 0}   | {f,f} |      0.01 |         0.0001
     1 | {1, 1}   | {f,f} |      0.01 |         0.0001
   ...
    49 | {49, 49} | {f,f} |      0.01 |         0.0001
    50 | {50, 50} | {f,f} |      0.01 |         0.0001
   ...
    97 | {97, 97} | {f,f} |      0.01 |         0.0001
    98 | {98, 98} | {f,f} |      0.01 |         0.0001
    99 | {99, 99} | {f,f} |      0.01 |         0.0001
(100 rows)
</programlisting>

    This confirms there are 100 distinct combinations in the two columns, and
    all of them are about equally likely (1% frequency for each one). The
    base frequency is the frequency computed from per-column statistics, as if
    there were no multi-column statistics. Had there been any null values in
    either of the columns, this would be identified in the
    <structfield>nulls</structfield> column.
   </para>

   <para>
    When estimating the selectivity, the planner applies all the conditions
    to the items in the <acronym>MCV</acronym> list, and then sums the
    frequencies of the matching ones. See
    <function>mcv_clauselist_selectivity</function>
    in <filename>src/backend/statistics/mcv.c</filename> for details.
   </para>

   <para>
    Compared to functional dependencies, <acronym>MCV</acronym> lists have two
    major advantages. Firstly, the list stores actual values, making it
    possible to decide which combinations are compatible:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
                               QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
   Filter: ((a = 1) AND (b = 10))
   Rows Removed by Filter: 10000
</programlisting>

    Secondly, <acronym>MCV</acronym> lists handle a wider range of clause types,
    not just equality clauses like functional dependencies.
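Both advantages follow from the same simple matching procedure, sketched here (illustrative Python; the real implementation, <function>mcv_clauselist_selectivity</function>, also has to account for the part of the table not covered by the list):

```python
# MCV items for table t: 100 equally likely combinations (i, i)
mcv_items = [((i, i), 0.01) for i in range(100)]

def mcv_selectivity(items, predicate):
    """Sum the frequencies of the MCV items that satisfy the clauses."""
    return sum(freq for values, freq in items if predicate(values))

# WHERE a = 1 AND b = 1: one matching item, 1% of the table
print(mcv_selectivity(mcv_items, lambda v: v[0] == 1 and v[1] == 1))    # 0.01

# WHERE a = 1 AND b = 10: no matching item, selectivity ~ 0
print(mcv_selectivity(mcv_items, lambda v: v[0] == 1 and v[1] == 10))   # 0

# WHERE a <= 49 AND b > 49: range clauses are handled the same way
print(mcv_selectivity(mcv_items, lambda v: v[0] <= 49 and v[1] > 49))   # 0
```
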
    For example,
    consider the following range query for the same table:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a &lt;= 49 AND b &gt; 49;
                               QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
   Filter: ((a &lt;= 49) AND (b &gt; 49))
   Rows Removed by Filter: 10000
</programlisting>

   </para>

  </sect2>

 </sect1>

 <sect1 id="planner-stats-security">
  <title>Planner Statistics and Security</title>

  <para>
   Access to the table <structname>pg_statistic</structname> is restricted to
   superusers, so that ordinary users cannot learn about the contents of
   other users' tables from it. Some selectivity estimation functions will
   use a user-provided operator (either the operator appearing in the query or
   a related operator) to analyze the stored statistics. For example, in order
   to determine whether a stored most common value is applicable, the
   selectivity estimator will have to run the appropriate <literal>=</literal>
   operator to compare the constant in the query to the stored value.
   Thus the data in <structname>pg_statistic</structname> is potentially
   passed to user-defined operators. An appropriately crafted operator can
   intentionally leak the passed operands (for example, by logging them
   or writing them to a different table), or accidentally leak them by showing
   their values in error messages, in either case possibly exposing data from
   <structname>pg_statistic</structname> to a user who should not be able to
   see it.
  </para>

  <para>
   In order to prevent this, the following applies to all built-in selectivity
   estimation functions.
   When planning a query, in order to be able to use
   stored statistics, the current user must either
   have <literal>SELECT</literal> privilege on the table or the involved
   columns, or the operator used must be <literal>LEAKPROOF</literal> (more
   accurately, the function that the operator is based on). If not, then the
   selectivity estimator will behave as if no statistics are available, and
   the planner will proceed with default or fall-back assumptions.
  </para>

  <para>
   If a user does not have the required privilege on the table or columns,
   then in many cases the query will ultimately receive a permission-denied
   error, in which case this mechanism is invisible in practice. But if the
   user is reading from a security-barrier view, then the planner might wish
   to check the statistics of an underlying table that is otherwise
   inaccessible to the user. In that case, the operator should be leak-proof
   or the statistics will not be used. There is no direct feedback about
   that, except that the plan might be suboptimal. If one suspects that this
   is the case, one could try running the query as a more privileged user,
   to see if a different plan results.
  </para>

  <para>
   This restriction applies only to cases where the planner would need to
   execute a user-defined operator on one or more values
   from <structname>pg_statistic</structname>. Thus the planner is permitted
   to use generic statistical information, such as the fraction of null values
   or the number of distinct values in a column, regardless of access
   privileges.
  </para>

  <para>
   Selectivity estimation functions contained in third-party extensions that
   potentially operate on statistics with user-defined operators should follow
   the same security rules. Consult the <productname>PostgreSQL</productname>
   source code for guidance.
  </para>
 </sect1>
</chapter>