1<!-- 2doc/src/sgml/ref/create_statistics.sgml 3PostgreSQL documentation 4--> 5 6<refentry id="sql-createstatistics"> 7 <indexterm zone="sql-createstatistics"> 8 <primary>CREATE STATISTICS</primary> 9 </indexterm> 10 11 <refmeta> 12 <refentrytitle>CREATE STATISTICS</refentrytitle> 13 <manvolnum>7</manvolnum> 14 <refmiscinfo>SQL - Language Statements</refmiscinfo> 15 </refmeta> 16 17 <refnamediv> 18 <refname>CREATE STATISTICS</refname> 19 <refpurpose>define extended statistics</refpurpose> 20 </refnamediv> 21 22 <refsynopsisdiv> 23<synopsis> 24CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="parameter">statistics_name</replaceable> 25 [ ( <replaceable class="parameter">statistics_kind</replaceable> [, ... ] ) ] 26 ON <replaceable class="parameter">column_name</replaceable>, <replaceable class="parameter">column_name</replaceable> [, ...] 27 FROM <replaceable class="parameter">table_name</replaceable> 28</synopsis> 29 30 </refsynopsisdiv> 31 32 <refsect1 id="sql-createstatistics-description"> 33 <title>Description</title> 34 35 <para> 36 <command>CREATE STATISTICS</command> will create a new extended statistics 37 object tracking data about the specified table, foreign table or 38 materialized view. The statistics object will be created in the current 39 database and will be owned by the user issuing the command. 40 </para> 41 42 <para> 43 If a schema name is given (for example, <literal>CREATE STATISTICS 44 myschema.mystat ...</literal>) then the statistics object is created in the 45 specified schema. Otherwise it is created in the current schema. 46 The name of the statistics object must be distinct from the name of any 47 other statistics object in the same schema. 48 </para> 49 </refsect1> 50 51 <refsect1> 52 <title>Parameters</title> 53 54 <variablelist> 55 56 <varlistentry> 57 <term><literal>IF NOT EXISTS</literal></term> 58 <listitem> 59 <para> 60 Do not throw an error if a statistics object with the same name already 61 exists. A notice is issued in this case. Note that only the name of 62 the statistics object is considered here, not the details of its 63 definition. 64 </para> 65 </listitem> 66 </varlistentry> 67 68 <varlistentry> 69 <term><replaceable class="parameter">statistics_name</replaceable></term> 70 <listitem> 71 <para> 72 The name (optionally schema-qualified) of the statistics object to be 73 created. 74 </para> 75 </listitem> 76 </varlistentry> 77 78 <varlistentry> 79 <term><replaceable class="parameter">statistics_kind</replaceable></term> 80 <listitem> 81 <para> 82 A statistics kind to be computed in this statistics object. 83 Currently supported kinds are 84 <literal>ndistinct</literal>, which enables n-distinct statistics, 85 <literal>dependencies</literal>, which enables functional 86 dependency statistics, and <literal>mcv</literal> which enables 87 most-common values lists. 88 If this clause is omitted, all supported statistics kinds are 89 included in the statistics object. 90 For more information, see <xref linkend="planner-stats-extended"/> 91 and <xref linkend="multivariate-statistics-examples"/>. 92 </para> 93 </listitem> 94 </varlistentry> 95 96 <varlistentry> 97 <term><replaceable class="parameter">column_name</replaceable></term> 98 <listitem> 99 <para> 100 The name of a table column to be covered by the computed statistics. 101 At least two column names must be given; the order of the column names 102 is insignificant. 103 </para> 104 </listitem> 105 </varlistentry> 106 107 <varlistentry> 108 <term><replaceable class="parameter">table_name</replaceable></term> 109 <listitem> 110 <para> 111 The name (optionally schema-qualified) of the table containing the 112 column(s) the statistics are computed on. 113 </para> 114 </listitem> 115 </varlistentry> 116 117 </variablelist> 118 </refsect1> 119 120 <refsect1> 121 <title>Notes</title> 122 123 <para> 124 You must be the owner of a table to create a statistics object 125 reading it. Once created, however, the ownership of the statistics 126 object is independent of the underlying table(s). 127 </para> 128 </refsect1> 129 130 <refsect1 id="sql-createstatistics-examples"> 131 <title>Examples</title> 132 133 <para> 134 Create table <structname>t1</structname> with two functionally dependent columns, i.e., 135 knowledge of a value in the first column is sufficient for determining the 136 value in the other column. Then functional dependency statistics are built 137 on those columns: 138 139<programlisting> 140CREATE TABLE t1 ( 141 a int, 142 b int 143); 144 145INSERT INTO t1 SELECT i/100, i/500 146 FROM generate_series(1,1000000) s(i); 147 148ANALYZE t1; 149 150-- the number of matching rows will be drastically underestimated: 151EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0); 152 153CREATE STATISTICS s1 (dependencies) ON a, b FROM t1; 154 155ANALYZE t1; 156 157-- now the row count estimate is more accurate: 158EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0); 159</programlisting> 160 161 Without functional-dependency statistics, the planner would assume 162 that the two <literal>WHERE</literal> conditions are independent, and would 163 multiply their selectivities together to arrive at a much-too-small 164 row count estimate. 165 With such statistics, the planner recognizes that the <literal>WHERE</literal> 166 conditions are redundant and does not underestimate the row count. 167 </para> 168 169 <para> 170 Create table <structname>t2</structname> with two perfectly correlated columns 171 (containing identical data), and a MCV list on those columns: 172 173<programlisting> 174CREATE TABLE t2 ( 175 a int, 176 b int 177); 178 179INSERT INTO t2 SELECT mod(i,100), mod(i,100) 180 FROM generate_series(1,1000000) s(i); 181 182CREATE STATISTICS s2 (mcv) ON a, b FROM t2; 183 184ANALYZE t2; 185 186-- valid combination (found in MCV) 187EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 1); 188 189-- invalid combination (not found in MCV) 190EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2); 191</programlisting> 192 193 The MCV list gives the planner more detailed information about the 194 specific values that commonly appear in the table, as well as an upper 195 bound on the selectivities of combinations of values that do not appear 196 in the table, allowing it to generate better estimates in both cases. 197 </para> 198 199 </refsect1> 200 201 <refsect1> 202 <title>Compatibility</title> 203 204 <para> 205 There is no <command>CREATE STATISTICS</command> command in the SQL standard. 206 </para> 207 </refsect1> 208 209 <refsect1> 210 <title>See Also</title> 211 212 <simplelist type="inline"> 213 <member><xref linkend="sql-alterstatistics"/></member> 214 <member><xref linkend="sql-dropstatistics"/></member> 215 </simplelist> 216 </refsect1> 217</refentry> 218