1<!-- doc/src/sgml/rowtypes.sgml -->
2
3<sect1 id="rowtypes">
4 <title>Composite Types</title>
5
6 <indexterm>
7  <primary>composite type</primary>
8 </indexterm>
9
10 <indexterm>
11  <primary>row type</primary>
12 </indexterm>
13
14 <para>
15  A <firstterm>composite type</firstterm> represents the structure of a row or record;
16  it is essentially just a list of field names and their data types.
17  <productname>PostgreSQL</productname> allows  composite types to be
18  used in many of the same ways that simple types can be used.  For example, a
19  column of a table can be declared to be of a composite type.
20 </para>
21
22 <sect2 id="rowtypes-declaring">
23  <title>Declaration of Composite Types</title>
24
25 <para>
26  Here are two simple examples of defining composite types:
27<programlisting>
28CREATE TYPE complex AS (
29    r       double precision,
30    i       double precision
31);
32
33CREATE TYPE inventory_item AS (
34    name            text,
35    supplier_id     integer,
36    price           numeric
37);
38</programlisting>
39  The syntax is comparable to <command>CREATE TABLE</command>, except that only
40  field names and types can be specified; no constraints (such as <literal>NOT
41  NULL</literal>) can presently be included.  Note that the <literal>AS</literal> keyword
42  is essential; without it, the system will think a different kind
43  of <command>CREATE TYPE</command> command is meant, and you will get odd syntax
44  errors.
45 </para>
46
47 <para>
48  Having defined the types, we can use them to create tables:
49
50<programlisting>
51CREATE TABLE on_hand (
52    item      inventory_item,
53    count     integer
54);
55
56INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);
57</programlisting>
58
59  or functions:
60
61<programlisting>
62CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
63AS 'SELECT $1.price * $2' LANGUAGE SQL;
64
65SELECT price_extension(item, 10) FROM on_hand;
66</programlisting>
67
68 </para>
69
70 <para>
71  Whenever you create a table, a composite type is also automatically
72  created, with the same name as the table, to represent the table's
73  row type.  For example, had we said:
74<programlisting>
75CREATE TABLE inventory_item (
76    name            text,
77    supplier_id     integer REFERENCES suppliers,
78    price           numeric CHECK (price &gt; 0)
79);
80</programlisting>
81  then the same <literal>inventory_item</literal> composite type shown above would
82  come into being as a
83  byproduct, and could be used just as above.  Note however an important
84  restriction of the current implementation: since no constraints are
85  associated with a composite type, the constraints shown in the table
86  definition <emphasis>do not apply</emphasis> to values of the composite type
87  outside the table.  (To work around this, create a domain over the composite
88  type, and apply the desired constraints as <literal>CHECK</literal>
89  constraints of the domain.)
90 </para>
91 </sect2>
92
93 <sect2>
94  <title>Constructing Composite Values</title>
95
96  <indexterm>
97   <primary>composite type</primary>
98   <secondary>constant</secondary>
99  </indexterm>
100
101  <para>
102   To write a composite value as a literal constant, enclose the field
103   values within parentheses and separate them by commas.  You can put double
104   quotes around any field value, and must do so if it contains commas or
105   parentheses.  (More details appear <link
106   linkend="rowtypes-io-syntax">below</link>.)  Thus, the general format of
107   a composite constant is the following:
108<synopsis>
109'( <replaceable>val1</replaceable> , <replaceable>val2</replaceable> , ... )'
110</synopsis>
111   An example is:
112<programlisting>
113'("fuzzy dice",42,1.99)'
114</programlisting>
115   which would be a valid value of the <literal>inventory_item</literal> type
116   defined above.  To make a field be NULL, write no characters at all
117   in its position in the list.  For example, this constant specifies
118   a NULL third field:
119<programlisting>
120'("fuzzy dice",42,)'
121</programlisting>
122   If you want an empty string rather than NULL, write double quotes:
123<programlisting>
124'("",42,)'
125</programlisting>
126   Here the first field is a non-NULL empty string, the third is NULL.
127  </para>
128
129  <para>
130   (These constants are actually only a special case of
131   the generic type constants discussed in <xref
132   linkend="sql-syntax-constants-generic"/>.  The constant is initially
133   treated as a string and passed to the composite-type input conversion
134   routine.  An explicit type specification might be necessary to tell
135   which type to convert the constant to.)
136  </para>
137
138 <para>
139  The <literal>ROW</literal> expression syntax can also be used to
140  construct composite values.  In most cases this is considerably
141  simpler to use than the string-literal syntax since you don't have
142  to worry about multiple layers of quoting.  We already used this
143  method above:
144<programlisting>
145ROW('fuzzy dice', 42, 1.99)
146ROW('', 42, NULL)
147</programlisting>
148  The ROW keyword is actually optional as long as you have more than one
149  field in the expression, so these can be simplified to:
150<programlisting>
151('fuzzy dice', 42, 1.99)
152('', 42, NULL)
153</programlisting>
154  The <literal>ROW</literal> expression syntax is discussed in more detail in <xref
155  linkend="sql-syntax-row-constructors"/>.
156 </para>
157 </sect2>
158
159 <sect2 id="rowtypes-accessing">
160  <title>Accessing Composite Types</title>
161
162 <para>
163  To access a field of a composite column, one writes a dot and the field
164  name, much like selecting a field from a table name.  In fact, it's so
165  much like selecting from a table name that you often have to use parentheses
166  to keep from confusing the parser.  For example, you might try to select
167  some subfields from our <literal>on_hand</literal> example table with something
168  like:
169
170<programlisting>
171SELECT item.name FROM on_hand WHERE item.price &gt; 9.99;
172</programlisting>
173
174  This will not work since the name <literal>item</literal> is taken to be a table
175  name, not a column name of <literal>on_hand</literal>, per SQL syntax rules.
176  You must write it like this:
177
178<programlisting>
179SELECT (item).name FROM on_hand WHERE (item).price &gt; 9.99;
180</programlisting>
181
182  or if you need to use the table name as well (for instance in a multitable
183  query), like this:
184
185<programlisting>
186SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price &gt; 9.99;
187</programlisting>
188
189  Now the parenthesized object is correctly interpreted as a reference to
190  the <literal>item</literal> column, and then the subfield can be selected from it.
191 </para>
192
193 <para>
194  Similar syntactic issues apply whenever you select a field from a composite
195  value.  For instance, to select just one field from the result of a function
196  that returns a composite value, you'd need to write something like:
197
198<programlisting>
199SELECT (my_func(...)).field FROM ...
200</programlisting>
201
202  Without the extra parentheses, this will generate a syntax error.
203 </para>
204
205 <para>
206  The special field name <literal>*</literal> means <quote>all fields</quote>, as
207  further explained in <xref linkend="rowtypes-usage"/>.
208 </para>
209 </sect2>
210
211 <sect2>
212  <title>Modifying Composite Types</title>
213
214 <para>
215  Here are some examples of the proper syntax for inserting and updating
216  composite columns.
217  First, inserting or updating a whole column:
218
219<programlisting>
220INSERT INTO mytab (complex_col) VALUES((1.1,2.2));
221
222UPDATE mytab SET complex_col = ROW(1.1,2.2) WHERE ...;
223</programlisting>
224
225  The first example omits <literal>ROW</literal>, the second uses it; we
226  could have done it either way.
227 </para>
228
229 <para>
230  We can update an individual subfield of a composite column:
231
232<programlisting>
233UPDATE mytab SET complex_col.r = (complex_col).r + 1 WHERE ...;
234</programlisting>
235
236  Notice here that we don't need to (and indeed cannot)
237  put parentheses around the column name appearing just after
238  <literal>SET</literal>, but we do need parentheses when referencing the same
239  column in the expression to the right of the equal sign.
240 </para>
241
242 <para>
243  And we can specify subfields as targets for <command>INSERT</command>, too:
244
245<programlisting>
246INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
247</programlisting>
248
249  Had we not supplied values for all the subfields of the column, the
250  remaining subfields would have been filled with null values.
251 </para>
252 </sect2>
253
254 <sect2 id="rowtypes-usage">
255  <title>Using Composite Types in Queries</title>
256
257  <para>
258   There are various special syntax rules and behaviors associated with
259   composite types in queries.  These rules provide useful shortcuts,
260   but can be confusing if you don't know the logic behind them.
261  </para>
262
263  <para>
264   In <productname>PostgreSQL</productname>, a reference to a table name (or alias)
265   in a query is effectively a reference to the composite value of the
266   table's current row.  For example, if we had a table
267   <structname>inventory_item</structname> as shown
268   <link linkend="rowtypes-declaring">above</link>, we could write:
269<programlisting>
270SELECT c FROM inventory_item c;
271</programlisting>
272   This query produces a single composite-valued column, so we might get
273   output like:
274<programlisting>
275           c
276------------------------
277 ("fuzzy dice",42,1.99)
278(1 row)
279</programlisting>
280   Note however that simple names are matched to column names before table
281   names, so this example works only because there is no column
282   named <structfield>c</structfield> in the query's tables.
283  </para>
284
285  <para>
286   The ordinary qualified-column-name
287   syntax <replaceable>table_name</replaceable><literal>.</literal><replaceable>column_name</replaceable>
288   can be understood as applying <link linkend="field-selection">field
289   selection</link> to the composite value of the table's current row.
290   (For efficiency reasons, it's not actually implemented that way.)
291  </para>
292
293  <para>
294   When we write
295<programlisting>
296SELECT c.* FROM inventory_item c;
297</programlisting>
298   then, according to the SQL standard, we should get the contents of the
299   table expanded into separate columns:
300<programlisting>
301    name    | supplier_id | price
302------------+-------------+-------
303 fuzzy dice |          42 |  1.99
304(1 row)
305</programlisting>
306   as if the query were
307<programlisting>
308SELECT c.name, c.supplier_id, c.price FROM inventory_item c;
309</programlisting>
310   <productname>PostgreSQL</productname> will apply this expansion behavior to
311   any composite-valued expression, although as shown <link
312   linkend="rowtypes-accessing">above</link>, you need to write parentheses
313   around the value that <literal>.*</literal> is applied to whenever it's not a
314   simple table name.  For example, if <function>myfunc()</function> is a function
315   returning a composite type with columns <structfield>a</structfield>,
316   <structfield>b</structfield>, and <structfield>c</structfield>, then these two queries have the
317   same result:
318<programlisting>
319SELECT (myfunc(x)).* FROM some_table;
320SELECT (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c FROM some_table;
321</programlisting>
322  </para>
323
324  <tip>
325   <para>
326    <productname>PostgreSQL</productname> handles column expansion by
327    actually transforming the first form into the second.  So, in this
328    example, <function>myfunc()</function> would get invoked three times per row
329    with either syntax.  If it's an expensive function you may wish to
330    avoid that, which you can do with a query like:
331<programlisting>
332SELECT m.* FROM some_table, LATERAL myfunc(x) AS m;
333</programlisting>
334    Placing the function in
335    a <literal>LATERAL</literal> <literal>FROM</literal> item keeps it from
336    being invoked more than once per row.  <literal>m.*</literal> is still
337    expanded into <literal>m.a, m.b, m.c</literal>, but now those variables
338    are just references to the output of the <literal>FROM</literal> item.
339    (The <literal>LATERAL</literal> keyword is optional here, but we show it
340    to clarify that the function is getting <structfield>x</structfield>
341    from <structname>some_table</structname>.)
342   </para>
343  </tip>
344
345  <para>
346   The <replaceable>composite_value</replaceable><literal>.*</literal> syntax results in
347   column expansion of this kind when it appears at the top level of
348   a <link linkend="queries-select-lists"><command>SELECT</command> output
349   list</link>, a <link linkend="dml-returning"><literal>RETURNING</literal>
350   list</link> in <command>INSERT</command>/<command>UPDATE</command>/<command>DELETE</command>,
351   a <link linkend="queries-values"><literal>VALUES</literal> clause</link>, or
352   a <link linkend="sql-syntax-row-constructors">row constructor</link>.
353   In all other contexts (including when nested inside one of those
354   constructs), attaching <literal>.*</literal> to a composite value does not
355   change the value, since it means <quote>all columns</quote> and so the
356   same composite value is produced again.  For example,
357   if <function>somefunc()</function> accepts a composite-valued argument,
358   these queries are the same:
359
360<programlisting>
361SELECT somefunc(c.*) FROM inventory_item c;
362SELECT somefunc(c) FROM inventory_item c;
363</programlisting>
364
365   In both cases, the current row of <structname>inventory_item</structname> is
366   passed to the function as a single composite-valued argument.
367   Even though <literal>.*</literal> does nothing in such cases, using it is good
368   style, since it makes clear that a composite value is intended.  In
369   particular, the parser will consider <literal>c</literal> in <literal>c.*</literal> to
370   refer to a table name or alias, not to a column name, so that there is
371   no ambiguity; whereas without <literal>.*</literal>, it is not clear
372   whether <literal>c</literal> means a table name or a column name, and in fact
373   the column-name interpretation will be preferred if there is a column
374   named <literal>c</literal>.
375  </para>
376
377  <para>
378   Another example demonstrating these concepts is that all these queries
379   mean the same thing:
380<programlisting>
381SELECT * FROM inventory_item c ORDER BY c;
382SELECT * FROM inventory_item c ORDER BY c.*;
383SELECT * FROM inventory_item c ORDER BY ROW(c.*);
384</programlisting>
385   All of these <literal>ORDER BY</literal> clauses specify the row's composite
386   value, resulting in sorting the rows according to the rules described
387   in <xref linkend="composite-type-comparison"/>.  However,
388   if <structname>inventory_item</structname> contained a column
389   named <structfield>c</structfield>, the first case would be different from the
390   others, as it would mean to sort by that column only.  Given the column
391   names previously shown, these queries are also equivalent to those above:
392<programlisting>
393SELECT * FROM inventory_item c ORDER BY ROW(c.name, c.supplier_id, c.price);
394SELECT * FROM inventory_item c ORDER BY (c.name, c.supplier_id, c.price);
395</programlisting>
396   (The last case uses a row constructor with the key word <literal>ROW</literal>
397   omitted.)
398  </para>
399
400  <para>
401   Another special syntactical behavior associated with composite values is
402   that we can use <firstterm>functional notation</firstterm> for extracting a field
403   of a composite value.  The simple way to explain this is that
404   the notations <literal><replaceable>field</replaceable>(<replaceable>table</replaceable>)</literal>
405   and <literal><replaceable>table</replaceable>.<replaceable>field</replaceable></literal>
406   are interchangeable.  For example, these queries are equivalent:
407
408<programlisting>
409SELECT c.name FROM inventory_item c WHERE c.price &gt; 1000;
410SELECT name(c) FROM inventory_item c WHERE price(c) &gt; 1000;
411</programlisting>
412
413   Moreover, if we have a function that accepts a single argument of a
414   composite type, we can call it with either notation.  These queries are
415   all equivalent:
416
417<programlisting>
418SELECT somefunc(c) FROM inventory_item c;
419SELECT somefunc(c.*) FROM inventory_item c;
420SELECT c.somefunc FROM inventory_item c;
421</programlisting>
422  </para>
423
424  <para>
425   This equivalence between functional notation and field notation
426   makes it possible to use functions on composite types to implement
427   <quote>computed fields</quote>.
428   <indexterm>
429    <primary>computed field</primary>
430   </indexterm>
431   <indexterm>
432    <primary>field</primary>
433    <secondary>computed</secondary>
434   </indexterm>
435   An application using the last query above wouldn't need to be directly
436   aware that <literal>somefunc</literal> isn't a real column of the table.
437  </para>
438
439  <tip>
440   <para>
441    Because of this behavior, it's unwise to give a function that takes a
442    single composite-type argument the same name as any of the fields of
443    that composite type.  If there is ambiguity, the field-name
444    interpretation will be chosen if field-name syntax is used, while the
445    function will be chosen if function-call syntax is used.  However,
446    <productname>PostgreSQL</productname> versions before 11 always chose the
447    field-name interpretation, unless the syntax of the call required it to
448    be a function call.  One way to force the function interpretation in
449    older versions is to schema-qualify the function name, that is, write
450    <literal><replaceable>schema</replaceable>.<replaceable>func</replaceable>(<replaceable>compositevalue</replaceable>)</literal>.
451   </para>
452  </tip>
453 </sect2>
454
455 <sect2 id="rowtypes-io-syntax">
456  <title>Composite Type Input and Output Syntax</title>
457
458  <para>
459   The external text representation of a composite value consists of items that
460   are interpreted according to the I/O conversion rules for the individual
461   field types, plus decoration that indicates the composite structure.
462   The decoration consists of parentheses (<literal>(</literal> and <literal>)</literal>)
463   around the whole value, plus commas (<literal>,</literal>) between adjacent
464   items.  Whitespace outside the parentheses is ignored, but within the
465   parentheses it is considered part of the field value, and might or might not be
466   significant depending on the input conversion rules for the field data type.
467   For example, in:
468<programlisting>
469'(  42)'
470</programlisting>
471   the whitespace will be ignored if the field type is integer, but not if
472   it is text.
473  </para>
474
475  <para>
476   As shown previously, when writing a composite value you can write double
477   quotes around any individual field value.
478   You <emphasis>must</emphasis> do so if the field value would otherwise
479   confuse the composite-value parser.  In particular, fields containing
480   parentheses, commas, double quotes, or backslashes must be double-quoted.
481   To put a double quote or backslash in a quoted composite field value,
482   precede it with a backslash.  (Also, a pair of double quotes within a
483   double-quoted field value is taken to represent a double quote character,
484   analogously to the rules for single quotes in SQL literal strings.)
485   Alternatively, you can avoid quoting and use backslash-escaping to
486   protect all data characters
487   that would otherwise be taken as composite syntax.
488  </para>
489
490  <para>
491   A completely empty field value (no characters at all between the commas
492   or parentheses) represents a NULL.  To write a value that is an empty
493   string rather than NULL, write <literal>""</literal>.
494  </para>
495
496  <para>
497   The composite output routine will put double quotes around field values
498   if they are empty strings or contain parentheses, commas,
499   double quotes, backslashes, or white space.  (Doing so for white space
500   is not essential, but aids legibility.)  Double quotes and backslashes
501   embedded in field values will be doubled.
502  </para>
503
504 <note>
505  <para>
506   Remember that what you write in an SQL command will first be interpreted
507   as a string literal, and then as a composite.  This doubles the number of
508   backslashes you need (assuming escape string syntax is used).
509   For example, to insert a <type>text</type> field
510   containing a double quote and a backslash in a composite
511   value, you'd need to write:
512<programlisting>
513INSERT ... VALUES ('("\"\\")');
514</programlisting>
515   The string-literal processor removes one level of backslashes, so that
516   what arrives at the composite-value parser looks like
517   <literal>("\"\\")</literal>.  In turn, the string
518   fed to the <type>text</type> data type's input routine
519   becomes <literal>"\</literal>.  (If we were working
520   with a data type whose input routine also treated backslashes specially,
521   <type>bytea</type> for example, we might need as many as eight backslashes
522   in the command to get one backslash into the stored composite field.)
523   Dollar quoting (see <xref linkend="sql-syntax-dollar-quoting"/>) can be
524   used to avoid the need to double backslashes.
525  </para>
526 </note>
527
528 <tip>
529  <para>
530   The <literal>ROW</literal> constructor syntax is usually easier to work with
531   than the composite-literal syntax when writing composite values in SQL
532   commands.
533   In <literal>ROW</literal>, individual field values are written the same way
534   they would be written when not members of a composite.
535  </para>
536 </tip>
537 </sect2>
538
539</sect1>
540