1<!-- doc/src/sgml/datatype.sgml -->
2
3 <chapter id="datatype">
4  <title>Data Types</title>
5
6  <indexterm zone="datatype">
7   <primary>data type</primary>
8  </indexterm>
9
10  <indexterm>
11   <primary>type</primary>
12   <see>data type</see>
13  </indexterm>
14
15  <para>
16   <productname>PostgreSQL</productname> has a rich set of native data
17   types available to users.  Users can add new types to
18   <productname>PostgreSQL</productname> using the <xref
19   linkend="sql-createtype"/> command.
20  </para>
21
22  <para>
23   <xref linkend="datatype-table"/> shows all the built-in general-purpose data
24   types. Most of the alternative names listed in the
25   <quote>Aliases</quote> column are the names used internally by
26   <productname>PostgreSQL</productname> for historical reasons.  In
27   addition, some internally used or deprecated types are available,
28   but are not listed here.
29  </para>
30
31   <table id="datatype-table">
32    <title>Data Types</title>
33    <tgroup cols="3">
34     <colspec colname="col1" colwidth="2*"/>
35     <colspec colname="col2" colwidth="1*"/>
36     <colspec colname="col3" colwidth="2*"/>
37     <thead>
38      <row>
39       <entry>Name</entry>
40       <entry>Aliases</entry>
41       <entry>Description</entry>
42      </row>
43     </thead>
44
45     <tbody>
46      <row>
47       <entry><type>bigint</type></entry>
48       <entry><type>int8</type></entry>
49       <entry>signed eight-byte integer</entry>
50      </row>
51
52      <row>
53       <entry><type>bigserial</type></entry>
54       <entry><type>serial8</type></entry>
55       <entry>autoincrementing eight-byte integer</entry>
56      </row>
57
58      <row>
59       <entry><type>bit [ (<replaceable>n</replaceable>) ]</type></entry>
60       <entry></entry>
61       <entry>fixed-length bit string</entry>
62      </row>
63
64      <row>
65       <entry><type>bit varying [ (<replaceable>n</replaceable>) ]</type></entry>
66       <entry><type>varbit [ (<replaceable>n</replaceable>) ]</type></entry>
67       <entry>variable-length bit string</entry>
68      </row>
69
70      <row>
71       <entry><type>boolean</type></entry>
72       <entry><type>bool</type></entry>
73       <entry>logical Boolean (true/false)</entry>
74      </row>
75
76      <row>
77       <entry><type>box</type></entry>
78       <entry></entry>
79       <entry>rectangular box on a plane</entry>
80      </row>
81
82      <row>
83       <entry><type>bytea</type></entry>
84       <entry></entry>
85       <entry>binary data (<quote>byte array</quote>)</entry>
86      </row>
87
88      <row>
89       <entry><type>character [ (<replaceable>n</replaceable>) ]</type></entry>
90       <entry><type>char [ (<replaceable>n</replaceable>) ]</type></entry>
91       <entry>fixed-length character string</entry>
92      </row>
93
94      <row>
95       <entry><type>character varying [ (<replaceable>n</replaceable>) ]</type></entry>
96       <entry><type>varchar [ (<replaceable>n</replaceable>) ]</type></entry>
97       <entry>variable-length character string</entry>
98      </row>
99
100      <row>
101       <entry><type>cidr</type></entry>
102       <entry></entry>
103       <entry>IPv4 or IPv6 network address</entry>
104      </row>
105
106      <row>
107       <entry><type>circle</type></entry>
108       <entry></entry>
109       <entry>circle on a plane</entry>
110      </row>
111
112      <row>
113       <entry><type>date</type></entry>
114       <entry></entry>
115       <entry>calendar date (year, month, day)</entry>
116      </row>
117
118      <row>
119       <entry><type>double precision</type></entry>
120       <entry><type>float8</type></entry>
121       <entry>double precision floating-point number (8 bytes)</entry>
122      </row>
123
124      <row>
125       <entry><type>inet</type></entry>
126       <entry></entry>
127       <entry>IPv4 or IPv6 host address</entry>
128      </row>
129
130      <row>
131       <entry><type>integer</type></entry>
132       <entry><type>int</type>, <type>int4</type></entry>
133       <entry>signed four-byte integer</entry>
134      </row>
135
136      <row>
137       <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
138       <entry></entry>
139       <entry>time span</entry>
140      </row>
141
142      <row>
143       <entry><type>json</type></entry>
144       <entry></entry>
145       <entry>textual JSON data</entry>
146      </row>
147
148      <row>
149       <entry><type>jsonb</type></entry>
150       <entry></entry>
151       <entry>binary JSON data, decomposed</entry>
152      </row>
153
154      <row>
155       <entry><type>line</type></entry>
156       <entry></entry>
157       <entry>infinite line on a plane</entry>
158      </row>
159
160      <row>
161       <entry><type>lseg</type></entry>
162       <entry></entry>
163       <entry>line segment on a plane</entry>
164      </row>
165
166      <row>
167       <entry><type>macaddr</type></entry>
168       <entry></entry>
169       <entry>MAC (Media Access Control) address</entry>
170      </row>
171
172      <row>
173       <entry><type>macaddr8</type></entry>
174       <entry></entry>
175       <entry>MAC (Media Access Control) address (EUI-64 format)</entry>
176      </row>
177
178      <row>
179       <entry><type>money</type></entry>
180       <entry></entry>
181       <entry>currency amount</entry>
182      </row>
183
184      <row>
185       <entry><type>numeric [ (<replaceable>p</replaceable>,
186         <replaceable>s</replaceable>) ]</type></entry>
187       <entry><type>decimal [ (<replaceable>p</replaceable>,
188         <replaceable>s</replaceable>) ]</type></entry>
189       <entry>exact numeric of selectable precision</entry>
190      </row>
191
192      <row>
193       <entry><type>path</type></entry>
194       <entry></entry>
195       <entry>geometric path on a plane</entry>
196      </row>
197
198      <row>
199       <entry><type>pg_lsn</type></entry>
200       <entry></entry>
201       <entry><productname>PostgreSQL</productname> Log Sequence Number</entry>
202      </row>
203
204      <row>
205       <entry><type>pg_snapshot</type></entry>
206       <entry></entry>
207       <entry>user-level transaction ID snapshot</entry>
208      </row>
209
210      <row>
211       <entry><type>point</type></entry>
212       <entry></entry>
213       <entry>geometric point on a plane</entry>
214      </row>
215
216      <row>
217       <entry><type>polygon</type></entry>
218       <entry></entry>
219       <entry>closed geometric path on a plane</entry>
220      </row>
221
222      <row>
223       <entry><type>real</type></entry>
224       <entry><type>float4</type></entry>
225       <entry>single precision floating-point number (4 bytes)</entry>
226      </row>
227
228      <row>
229       <entry><type>smallint</type></entry>
230       <entry><type>int2</type></entry>
231       <entry>signed two-byte integer</entry>
232      </row>
233
234      <row>
235       <entry><type>smallserial</type></entry>
236       <entry><type>serial2</type></entry>
237       <entry>autoincrementing two-byte integer</entry>
238      </row>
239
240      <row>
241       <entry><type>serial</type></entry>
242       <entry><type>serial4</type></entry>
243       <entry>autoincrementing four-byte integer</entry>
244      </row>
245
246      <row>
247       <entry><type>text</type></entry>
248       <entry></entry>
249       <entry>variable-length character string</entry>
250      </row>
251
252      <row>
253       <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
254       <entry></entry>
255       <entry>time of day (no time zone)</entry>
256      </row>
257
258      <row>
259       <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
260       <entry><type>timetz</type></entry>
261       <entry>time of day, including time zone</entry>
262      </row>
263
264      <row>
265       <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
266       <entry></entry>
267       <entry>date and time (no time zone)</entry>
268      </row>
269
270      <row>
271       <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
272       <entry><type>timestamptz</type></entry>
273       <entry>date and time, including time zone</entry>
274      </row>
275
276      <row>
277       <entry><type>tsquery</type></entry>
278       <entry></entry>
279       <entry>text search query</entry>
280      </row>
281
282      <row>
283       <entry><type>tsvector</type></entry>
284       <entry></entry>
285       <entry>text search document</entry>
286      </row>
287
288      <row>
289       <entry><type>txid_snapshot</type></entry>
290       <entry></entry>
291       <entry>user-level transaction ID snapshot (deprecated; see <type>pg_snapshot</type>)</entry>
292      </row>
293
294      <row>
295       <entry><type>uuid</type></entry>
296       <entry></entry>
297       <entry>universally unique identifier</entry>
298      </row>
299
300      <row>
301       <entry><type>xml</type></entry>
302       <entry></entry>
303       <entry>XML data</entry>
304      </row>
305     </tbody>
306    </tgroup>
307   </table>
308
309  <note>
310   <title>Compatibility</title>
311   <para>
312    The following types (or spellings thereof) are specified by
313    <acronym>SQL</acronym>: <type>bigint</type>, <type>bit</type>, <type>bit
314    varying</type>, <type>boolean</type>, <type>char</type>,
315    <type>character varying</type>, <type>character</type>,
316    <type>varchar</type>, <type>date</type>, <type>double
317    precision</type>, <type>integer</type>, <type>interval</type>,
318    <type>numeric</type>, <type>decimal</type>, <type>real</type>,
319    <type>smallint</type>, <type>time</type> (with or without time zone),
320    <type>timestamp</type> (with or without time zone),
321    <type>xml</type>.
322   </para>
323  </note>
324
325  <para>
326   Each data type has an external representation determined by its input
327   and output functions.  Many of the built-in types have
328   obvious external formats.  However, several types are either unique
329   to <productname>PostgreSQL</productname>, such as geometric
330   paths, or have several possible formats, such as the date
331   and time types.
332   Some of the input and output functions are not invertible, i.e.,
333   the result of an output function might lose accuracy when compared to
334   the original input.
335  </para>
336
337  <sect1 id="datatype-numeric">
338   <title>Numeric Types</title>
339
340   <indexterm zone="datatype-numeric">
341    <primary>data type</primary>
342    <secondary>numeric</secondary>
343   </indexterm>
344
345   <para>
346    Numeric types consist of two-, four-, and eight-byte integers,
347    four- and eight-byte floating-point numbers, and selectable-precision
348    decimals.  <xref linkend="datatype-numeric-table"/> lists the
349    available types.
350   </para>
351
352    <table id="datatype-numeric-table">
353     <title>Numeric Types</title>
354     <tgroup cols="4">
355      <colspec colname="col1" colwidth="2*"/>
356      <colspec colname="col2" colwidth="1*"/>
357      <colspec colname="col3" colwidth="2*"/>
358      <colspec colname="col4" colwidth="2*"/>
359      <thead>
360       <row>
361        <entry>Name</entry>
362        <entry>Storage Size</entry>
363        <entry>Description</entry>
364        <entry>Range</entry>
365       </row>
366      </thead>
367
368      <tbody>
369       <row>
370        <entry><type>smallint</type></entry>
371        <entry>2 bytes</entry>
372        <entry>small-range integer</entry>
373        <entry>-32768 to +32767</entry>
374       </row>
375       <row>
376        <entry><type>integer</type></entry>
377        <entry>4 bytes</entry>
378        <entry>typical choice for integer</entry>
379        <entry>-2147483648 to +2147483647</entry>
380       </row>
381       <row>
382        <entry><type>bigint</type></entry>
383        <entry>8 bytes</entry>
384        <entry>large-range integer</entry>
385        <entry>-9223372036854775808 to +9223372036854775807</entry>
386       </row>
387
388       <row>
389        <entry><type>decimal</type></entry>
390        <entry>variable</entry>
391        <entry>user-specified precision, exact</entry>
392        <entry>up to 131072 digits before the decimal point; up to 16383 digits after the decimal point</entry>
393       </row>
394       <row>
395        <entry><type>numeric</type></entry>
396        <entry>variable</entry>
397        <entry>user-specified precision, exact</entry>
398        <entry>up to 131072 digits before the decimal point; up to 16383 digits after the decimal point</entry>
399       </row>
400
401       <row>
402        <entry><type>real</type></entry>
403        <entry>4 bytes</entry>
404        <entry>variable-precision, inexact</entry>
405        <entry>6 decimal digits precision</entry>
406       </row>
407       <row>
408        <entry><type>double precision</type></entry>
409        <entry>8 bytes</entry>
410        <entry>variable-precision, inexact</entry>
411        <entry>15 decimal digits precision</entry>
412       </row>
413
414       <row>
415        <entry><type>smallserial</type></entry>
416        <entry>2 bytes</entry>
417        <entry>small autoincrementing integer</entry>
418        <entry>1 to 32767</entry>
419       </row>
420
421       <row>
422        <entry><type>serial</type></entry>
423        <entry>4 bytes</entry>
424        <entry>autoincrementing integer</entry>
425        <entry>1 to 2147483647</entry>
426       </row>
427
428       <row>
429        <entry><type>bigserial</type></entry>
430        <entry>8 bytes</entry>
431        <entry>large autoincrementing integer</entry>
432        <entry>1 to 9223372036854775807</entry>
433       </row>
434      </tbody>
435     </tgroup>
436    </table>
437
438   <para>
439    The syntax of constants for the numeric types is described in
440    <xref linkend="sql-syntax-constants"/>.  The numeric types have a
441    full set of corresponding arithmetic operators and
442    functions. Refer to <xref linkend="functions"/> for more
443    information.  The following sections describe the types in detail.
444   </para>
445
446   <sect2 id="datatype-int">
447    <title>Integer Types</title>
448
449    <indexterm zone="datatype-int">
450     <primary>integer</primary>
451    </indexterm>
452
453    <indexterm zone="datatype-int">
454     <primary>smallint</primary>
455    </indexterm>
456
457    <indexterm zone="datatype-int">
458     <primary>bigint</primary>
459    </indexterm>
460
461    <indexterm>
462     <primary>int4</primary>
463     <see>integer</see>
464    </indexterm>
465
466    <indexterm>
467     <primary>int2</primary>
468     <see>smallint</see>
469    </indexterm>
470
471    <indexterm>
472     <primary>int8</primary>
473     <see>bigint</see>
474    </indexterm>
475
476    <para>
477     The types <type>smallint</type>, <type>integer</type>, and
478     <type>bigint</type> store whole numbers, that is, numbers without
479     fractional components, of various ranges.  Attempts to store
480     values outside of the allowed range will result in an error.
481    </para>
482
483    <para>
484     The type <type>integer</type> is the common choice, as it offers
485     the best balance between range, storage size, and performance.
486     The <type>smallint</type> type is generally only used if disk
487     space is at a premium.  The <type>bigint</type> type is designed to be
488     used when the range of the <type>integer</type> type is insufficient.
489    </para>
490
491    <para>
492     <acronym>SQL</acronym> only specifies the integer types
493     <type>integer</type> (or <type>int</type>),
494     <type>smallint</type>, and <type>bigint</type>.  The
495     type names <type>int2</type>, <type>int4</type>, and
496     <type>int8</type> are extensions, which are also used by some
497     other <acronym>SQL</acronym> database systems.
498    </para>
499
500   </sect2>
501
502   <sect2 id="datatype-numeric-decimal">
503    <title>Arbitrary Precision Numbers</title>
504
505    <indexterm>
506     <primary>numeric (data type)</primary>
507    </indexterm>
508
509   <indexterm>
510    <primary>arbitrary precision numbers</primary>
511   </indexterm>
512
513    <indexterm>
514     <primary>decimal</primary>
515     <see>numeric</see>
516    </indexterm>
517
518    <para>
519     The type <type>numeric</type> can store numbers with a
520     very large number of digits. It is especially recommended for
521     storing monetary amounts and other quantities where exactness is
522     required.  Calculations with <type>numeric</type> values yield exact
523     results where possible, e.g.,  addition, subtraction, multiplication.
524     However, calculations on <type>numeric</type> values are very slow
525     compared to the integer types, or to the floating-point types
526     described in the next section.
527    </para>
528
529    <para>
530     We use the following terms below:  The
531     <firstterm>precision</firstterm> of a <type>numeric</type>
532     is the total count of significant digits in the whole number,
533     that is, the number of digits to both sides of the decimal point.
534     The <firstterm>scale</firstterm> of a <type>numeric</type> is the
535     count of decimal digits in the fractional part, to the right of the
536     decimal point.  So the number 23.5141 has a precision of 6 and a
537     scale of 4.  Integers can be considered to have a scale of zero.
538    </para>
539
540    <para>
541     Both the maximum precision and the maximum scale of a
542     <type>numeric</type> column can be
543     configured.  To declare a column of type <type>numeric</type> use
544     the syntax:
545<programlisting>
546NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>)
547</programlisting>
548     The precision must be positive, the scale zero or positive.
549     Alternatively:
550<programlisting>
551NUMERIC(<replaceable>precision</replaceable>)
552</programlisting>
553     selects a scale of 0.  Specifying:
554<programlisting>
555NUMERIC
556</programlisting>
557     without any precision or scale creates a column in which numeric
558     values of any precision and scale can be stored, up to the
559     implementation limit on precision.  A column of this kind will
560     not coerce input values to any particular scale, whereas
561     <type>numeric</type> columns with a declared scale will coerce
562     input values to that scale.  (The <acronym>SQL</acronym> standard
563     requires a default scale of 0, i.e., coercion to integer
564     precision.  We find this a bit useless.  If you're concerned
565     about portability, always specify the precision and scale
566     explicitly.)
567    </para>
568
569    <note>
570     <para>
571      The maximum allowed precision when explicitly specified in the
572      type declaration is 1000; <type>NUMERIC</type> without a specified
573      precision is subject to the limits described in <xref
574      linkend="datatype-numeric-table"/>.
575     </para>
576    </note>
577
578    <para>
579     If the scale of a value to be stored is greater than the declared
580     scale of the column, the system will round the value to the specified
581     number of fractional digits.  Then, if the number of digits to the
582     left of the decimal point exceeds the declared precision minus the
583     declared scale, an error is raised.
584    </para>
585
586    <para>
587     Numeric values are physically stored without any extra leading or
588     trailing zeroes.  Thus, the declared precision and scale of a column
589     are maximums, not fixed allocations.  (In this sense the <type>numeric</type>
590     type is more akin to <type>varchar(<replaceable>n</replaceable>)</type>
591     than to <type>char(<replaceable>n</replaceable>)</type>.)  The actual storage
592     requirement is two bytes for each group of four decimal digits,
593     plus three to eight bytes overhead.
594    </para>
595
596    <indexterm>
597     <primary>NaN</primary>
598     <see>not a number</see>
599   </indexterm>
600
601    <indexterm>
602     <primary>not a number</primary>
603     <secondary>numeric (data type)</secondary>
604    </indexterm>
605
606    <para>
607     In addition to ordinary numeric values, the <type>numeric</type>
608     type allows the special value <literal>NaN</literal>, meaning
609     <quote>not-a-number</quote>.  Any operation on <literal>NaN</literal>
610     yields another <literal>NaN</literal>.  When writing this value
611     as a constant in an SQL command, you must put quotes around it,
612     for example <literal>UPDATE table SET x = 'NaN'</literal>.  On input,
613     the string <literal>NaN</literal> is recognized in a case-insensitive manner.
614    </para>
615
616    <note>
617     <para>
618      In most implementations of the <quote>not-a-number</quote> concept,
619      <literal>NaN</literal> is not considered equal to any other numeric
620      value (including <literal>NaN</literal>).  In order to allow
621      <type>numeric</type> values to be sorted and used in tree-based
622      indexes, <productname>PostgreSQL</productname> treats <literal>NaN</literal>
623      values as equal, and greater than all non-<literal>NaN</literal>
624      values.
625     </para>
626    </note>
627
628    <para>
629     The types <type>decimal</type> and <type>numeric</type> are
630     equivalent.  Both types are part of the <acronym>SQL</acronym>
631     standard.
632    </para>
633
634    <para>
635     When rounding values, the <type>numeric</type> type rounds ties away
636     from zero, while (on most machines) the <type>real</type>
637     and <type>double precision</type> types round ties to the nearest even
638     number.  For example:
639
640<programlisting>
641SELECT x,
642  round(x::numeric) AS num_round,
643  round(x::double precision) AS dbl_round
644FROM generate_series(-3.5, 3.5, 1) as x;
645  x   | num_round | dbl_round
646------+-----------+-----------
647 -3.5 |        -4 |        -4
648 -2.5 |        -3 |        -2
649 -1.5 |        -2 |        -2
650 -0.5 |        -1 |        -0
651  0.5 |         1 |         0
652  1.5 |         2 |         2
653  2.5 |         3 |         2
654  3.5 |         4 |         4
655(8 rows)
656</programlisting>
657    </para>
658   </sect2>
659
660
661   <sect2 id="datatype-float">
662    <title>Floating-Point Types</title>
663
664    <indexterm zone="datatype-float">
665     <primary>real</primary>
666    </indexterm>
667
668    <indexterm zone="datatype-float">
669     <primary>double precision</primary>
670    </indexterm>
671
672    <indexterm>
673     <primary>float4</primary>
674     <see>real</see>
675    </indexterm>
676
677    <indexterm>
678     <primary>float8</primary>
679     <see>double precision</see>
680    </indexterm>
681
682    <indexterm zone="datatype-float">
683     <primary>floating point</primary>
684    </indexterm>
685
686    <para>
687     The data types <type>real</type> and <type>double precision</type> are
688     inexact, variable-precision numeric types. On all currently supported
689     platforms, these types are implementations of <acronym>IEEE</acronym>
690     Standard 754 for Binary Floating-Point Arithmetic (single and double
691     precision, respectively), to the extent that the underlying processor,
692     operating system, and compiler support it.
693    </para>
694
695    <para>
696     Inexact means that some values cannot be converted exactly to the
697     internal format and are stored as approximations, so that storing
698     and retrieving a value might show slight discrepancies.
699     Managing these errors and how they propagate through calculations
700     is the subject of an entire branch of mathematics and computer
701     science and will not be discussed here, except for the
702     following points:
703     <itemizedlist>
704      <listitem>
705       <para>
706        If you require exact storage and calculations (such as for
707        monetary amounts), use the <type>numeric</type> type instead.
708       </para>
709      </listitem>
710
711      <listitem>
712       <para>
713        If you want to do complicated calculations with these types
714        for anything important, especially if you rely on certain
715        behavior in boundary cases (infinity, underflow), you should
716        evaluate the implementation carefully.
717       </para>
718      </listitem>
719
720      <listitem>
721       <para>
722        Comparing two floating-point values for equality might not
723        always work as expected.
724       </para>
725      </listitem>
726     </itemizedlist>
727    </para>
728
729    <para>
730     On all currently supported platforms, the <type>real</type> type has a
731     range of around 1E-37 to 1E+37 with a precision of at least 6 decimal
732     digits. The <type>double precision</type> type has a range of around
733     1E-307 to 1E+308 with a precision of at least 15 digits. Values that are
734     too large or too small will cause an error. Rounding might take place if
735     the precision of an input number is too high. Numbers too close to zero
736     that are not representable as distinct from zero will cause an underflow
737     error.
738    </para>
739
740    <para>
741     By default, floating point values are output in text form in their
742     shortest precise decimal representation; the decimal value produced is
743     closer to the true stored binary value than to any other value
744     representable in the same binary precision. (However, the output value is
745     currently never <emphasis>exactly</emphasis> midway between two
746     representable values, in order to avoid a widespread bug where input
747     routines do not properly respect the round-to-nearest-even rule.) This value will
748     use at most 17 significant decimal digits for <type>float8</type>
749     values, and at most 9 digits for <type>float4</type> values.
750    </para>
751
752    <note>
753     <para>
754      This shortest-precise output format is much faster to generate than the
755      historical rounded format.
756     </para>
757    </note>
758
759    <para>
760     For compatibility with output generated by older versions
761     of <productname>PostgreSQL</productname>, and to allow the output
762     precision to be reduced, the <xref linkend="guc-extra-float-digits"/>
763     parameter can be used to select rounded decimal output instead. Setting a
764     value of 0 restores the previous default of rounding the value to 6
765     (for <type>float4</type>) or 15 (for <type>float8</type>)
766     significant decimal digits. Setting a negative value reduces the number
767     of digits further; for example -2 would round output to 4 or 13 digits
768     respectively.
769    </para>
770
771    <para>
772     Any value of <xref linkend="guc-extra-float-digits"/> greater than 0
773     selects the shortest-precise format.
774    </para>
775
776    <note>
777     <para>
778      Applications that wanted precise values have historically had to set
779      <xref linkend="guc-extra-float-digits"/> to 3 to obtain them. For
780      maximum compatibility between versions, they should continue to do so.
781     </para>
782    </note>
783
784    <indexterm>
785     <primary>not a number</primary>
786     <secondary>double precision</secondary>
787    </indexterm>
788
789    <para>
790     In addition to ordinary numeric values, the floating-point types
791     have several special values:
792<literallayout>
793<literal>Infinity</literal>
794<literal>-Infinity</literal>
795<literal>NaN</literal>
796</literallayout>
797     These represent the IEEE 754 special values
798     <quote>infinity</quote>, <quote>negative infinity</quote>, and
799     <quote>not-a-number</quote>, respectively. When writing these values
800     as constants in an SQL command, you must put quotes around them,
801     for example <literal>UPDATE table SET x = '-Infinity'</literal>.  On input,
802     these strings are recognized in a case-insensitive manner.
803    </para>
804
805    <note>
806     <para>
807      IEEE754 specifies that <literal>NaN</literal> should not compare equal
808      to any other floating-point value (including <literal>NaN</literal>).
809      In order to allow floating-point values to be sorted and used
810      in tree-based indexes, <productname>PostgreSQL</productname> treats
811      <literal>NaN</literal> values as equal, and greater than all
812      non-<literal>NaN</literal> values.
813     </para>
814    </note>
815
816    <para>
817     <productname>PostgreSQL</productname> also supports the SQL-standard
818     notations <type>float</type> and
819     <type>float(<replaceable>p</replaceable>)</type> for specifying
820     inexact numeric types.  Here, <replaceable>p</replaceable> specifies
821     the minimum acceptable precision in <emphasis>binary</emphasis> digits.
822     <productname>PostgreSQL</productname> accepts
823     <type>float(1)</type> to <type>float(24)</type> as selecting the
824     <type>real</type> type, while
825     <type>float(25)</type> to <type>float(53)</type> select
826     <type>double precision</type>.  Values of <replaceable>p</replaceable>
827     outside the allowed range draw an error.
828     <type>float</type> with no precision specified is taken to mean
829     <type>double precision</type>.
830    </para>
831
832   </sect2>
833
834   <sect2 id="datatype-serial">
835    <title>Serial Types</title>
836
837    <indexterm zone="datatype-serial">
838     <primary>smallserial</primary>
839    </indexterm>
840
841    <indexterm zone="datatype-serial">
842     <primary>serial</primary>
843    </indexterm>
844
845    <indexterm zone="datatype-serial">
846     <primary>bigserial</primary>
847    </indexterm>
848
849    <indexterm zone="datatype-serial">
850     <primary>serial2</primary>
851    </indexterm>
852
853    <indexterm zone="datatype-serial">
854     <primary>serial4</primary>
855    </indexterm>
856
857    <indexterm zone="datatype-serial">
858     <primary>serial8</primary>
859    </indexterm>
860
861    <indexterm>
862     <primary>auto-increment</primary>
863     <see>serial</see>
864    </indexterm>
865
866    <indexterm>
867     <primary>sequence</primary>
868     <secondary>and serial type</secondary>
869    </indexterm>
870
871    <note>
872     <para>
873      This section describes a PostgreSQL-specific way to create an
874      autoincrementing column.  Another way is to use the SQL-standard
875      identity column feature, described at <xref linkend="sql-createtable"/>.
876     </para>
877    </note>
878
879    <para>
880     The data types <type>smallserial</type>, <type>serial</type> and
881     <type>bigserial</type> are not true types, but merely
882     a notational convenience for creating unique identifier columns
883     (similar to the <literal>AUTO_INCREMENT</literal> property
884     supported by some other databases). In the current
885     implementation, specifying:
886
887<programlisting>
888CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
889    <replaceable class="parameter">colname</replaceable> SERIAL
890);
891</programlisting>
892
893     is equivalent to specifying:
894
895<programlisting>
896CREATE SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq AS integer;
897CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
898    <replaceable class="parameter">colname</replaceable> integer NOT NULL DEFAULT nextval('<replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq')
899);
900ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq OWNED BY <replaceable class="parameter">tablename</replaceable>.<replaceable class="parameter">colname</replaceable>;
901</programlisting>
902
903     Thus, we have created an integer column and arranged for its default
904     values to be assigned from a sequence generator.  A <literal>NOT NULL</literal>
905     constraint is applied to ensure that a null value cannot be
906     inserted.  (In most cases you would also want to attach a
907     <literal>UNIQUE</literal> or <literal>PRIMARY KEY</literal> constraint to prevent
908     duplicate values from being inserted by accident, but this is
909     not automatic.)  Lastly, the sequence is marked as <quote>owned by</quote>
910     the column, so that it will be dropped if the column or table is dropped.
911    </para>
912
913    <note>
914      <para>
915        Because <type>smallserial</type>, <type>serial</type> and
916        <type>bigserial</type> are implemented using sequences, there may
917        be "holes" or gaps in the sequence of values which appears in the
918        column, even if no rows are ever deleted.  A value allocated
919        from the sequence is still "used up" even if a row containing that
920        value is never successfully inserted into the table column.  This
921        may happen, for example, if the inserting transaction rolls back.
922        See <literal>nextval()</literal> in <xref linkend="functions-sequence"/>
923        for details.
924      </para>
925    </note>
926
927    <para>
928     To insert the next value of the sequence into the <type>serial</type>
929     column, specify that the <type>serial</type>
930     column should be assigned its default value. This can be done
931     either by excluding the column from the list of columns in
932     the <command>INSERT</command> statement, or through the use of
933     the <literal>DEFAULT</literal> key word.
934    </para>
935
936    <para>
937     The type names <type>serial</type> and <type>serial4</type> are
938     equivalent: both create <type>integer</type> columns.  The type
939     names <type>bigserial</type> and <type>serial8</type> work
940     the same way, except that they create a <type>bigint</type>
941     column.  <type>bigserial</type> should be used if you anticipate
942     the use of more than 2<superscript>31</superscript> identifiers over the
943     lifetime of the table. The type names <type>smallserial</type> and
944     <type>serial2</type> also work the same way, except that they
945     create a <type>smallint</type> column.
946    </para>
947
948    <para>
949     The sequence created for a <type>serial</type> column is
950     automatically dropped when the owning column is dropped.
951     You can drop the sequence without dropping the column, but this
952     will force removal of the column default expression.
953    </para>
954   </sect2>
955  </sect1>
956
957  <sect1 id="datatype-money">
958   <title>Monetary Types</title>
959
960   <para>
961    The <type>money</type> type stores a currency amount with a fixed
962    fractional precision; see <xref
963    linkend="datatype-money-table"/>.  The fractional precision is
964    determined by the database's <xref linkend="guc-lc-monetary"/> setting.
965    The range shown in the table assumes there are two fractional digits.
966    Input is accepted in a variety of formats, including integer and
967    floating-point literals, as well as typical
968    currency formatting, such as <literal>'$1,000.00'</literal>.
969    Output is generally in the latter form but depends on the locale.
970   </para>
971
972    <table id="datatype-money-table">
973     <title>Monetary Types</title>
974     <tgroup cols="4">
975      <colspec colname="col1" colwidth="2*"/>
976      <colspec colname="col2" colwidth="1*"/>
977      <colspec colname="col3" colwidth="2*"/>
978      <colspec colname="col4" colwidth="2*"/>
979      <thead>
980       <row>
981        <entry>Name</entry>
982        <entry>Storage Size</entry>
983        <entry>Description</entry>
984        <entry>Range</entry>
985       </row>
986      </thead>
987      <tbody>
988       <row>
989        <entry><type>money</type></entry>
990        <entry>8 bytes</entry>
991        <entry>currency amount</entry>
992        <entry>-92233720368547758.08 to +92233720368547758.07</entry>
993       </row>
994      </tbody>
995     </tgroup>
996    </table>
997
998   <para>
999    Since the output of this data type is locale-sensitive, it might not
1000    work to load <type>money</type> data into a database that has a different
1001    setting of <varname>lc_monetary</varname>.  To avoid problems, before
1002    restoring a dump into a new database make sure <varname>lc_monetary</varname> has
1003    the same or equivalent value as in the database that was dumped.
1004   </para>
1005
1006   <para>
1007    Values of the <type>numeric</type>, <type>int</type>, and
1008    <type>bigint</type> data types can be cast to <type>money</type>.
1009    Conversion from the <type>real</type> and <type>double precision</type>
1010    data types can be done by casting to <type>numeric</type> first, for
1011    example:
1012<programlisting>
1013SELECT '12.34'::float8::numeric::money;
1014</programlisting>
1015    However, this is not recommended.  Floating point numbers should not be
1016    used to handle money due to the potential for rounding errors.
1017   </para>
1018
1019   <para>
1020    A <type>money</type> value can be cast to <type>numeric</type> without
1021    loss of precision. Conversion to other types could potentially lose
1022    precision, and must also be done in two stages:
1023<programlisting>
1024SELECT '52093.89'::money::numeric::float8;
1025</programlisting>
1026   </para>
1027
1028   <para>
1029    Division of a <type>money</type> value by an integer value is performed
1030    with truncation of the fractional part towards zero.  To get a rounded
1031    result, divide by a floating-point value, or cast the <type>money</type>
1032    value to <type>numeric</type> before dividing and back to <type>money</type>
1033    afterwards.  (The latter is preferable to avoid risking precision loss.)
1034    When a <type>money</type> value is divided by another <type>money</type>
1035    value, the result is <type>double precision</type> (i.e., a pure number,
1036    not money); the currency units cancel each other out in the division.
1037   </para>
1038  </sect1>
1039
1040
1041  <sect1 id="datatype-character">
1042   <title>Character Types</title>
1043
1044   <indexterm zone="datatype-character">
1045    <primary>character string</primary>
1046    <secondary>data types</secondary>
1047   </indexterm>
1048
1049   <indexterm>
1050    <primary>string</primary>
1051    <see>character string</see>
1052   </indexterm>
1053
1054   <indexterm zone="datatype-character">
1055    <primary>character</primary>
1056   </indexterm>
1057
1058   <indexterm zone="datatype-character">
1059    <primary>character varying</primary>
1060   </indexterm>
1061
1062   <indexterm zone="datatype-character">
1063    <primary>text</primary>
1064   </indexterm>
1065
1066   <indexterm zone="datatype-character">
1067    <primary>char</primary>
1068   </indexterm>
1069
1070   <indexterm zone="datatype-character">
1071    <primary>varchar</primary>
1072   </indexterm>
1073
1074    <table id="datatype-character-table">
1075     <title>Character Types</title>
1076     <tgroup cols="2">
1077      <thead>
1078       <row>
1079        <entry>Name</entry>
1080        <entry>Description</entry>
1081       </row>
1082      </thead>
1083      <tbody>
1084       <row>
1085        <entry><type>character varying(<replaceable>n</replaceable>)</type>, <type>varchar(<replaceable>n</replaceable>)</type></entry>
1086        <entry>variable-length with limit</entry>
1087       </row>
1088       <row>
1089        <entry><type>character(<replaceable>n</replaceable>)</type>, <type>char(<replaceable>n</replaceable>)</type></entry>
1090        <entry>fixed-length, blank padded</entry>
1091       </row>
1092       <row>
1093        <entry><type>text</type></entry>
1094        <entry>variable unlimited length</entry>
1095       </row>
1096     </tbody>
1097     </tgroup>
1098    </table>
1099
1100   <para>
1101    <xref linkend="datatype-character-table"/> shows the
1102    general-purpose character types available in
1103    <productname>PostgreSQL</productname>.
1104   </para>
1105
1106   <para>
1107    <acronym>SQL</acronym> defines two primary character types:
1108    <type>character varying(<replaceable>n</replaceable>)</type> and
1109    <type>character(<replaceable>n</replaceable>)</type>, where <replaceable>n</replaceable>
1110    is a positive integer.  Both of these types can store strings up to
1111    <replaceable>n</replaceable> characters (not bytes) in length.  An attempt to store a
1112    longer string into a column of these types will result in an
1113    error, unless the excess characters are all spaces, in which case
1114    the string will be truncated to the maximum length. (This somewhat
1115    bizarre exception is required by the <acronym>SQL</acronym>
1116    standard.) If the string to be stored is shorter than the declared
1117    length, values of type <type>character</type> will be space-padded;
1118    values of type <type>character varying</type> will simply store the
1119    shorter
1120    string.
1121   </para>
1122
1123   <para>
1124    If one explicitly casts a value to <type>character
1125    varying(<replaceable>n</replaceable>)</type> or
1126    <type>character(<replaceable>n</replaceable>)</type>, then an over-length
1127    value will be truncated to <replaceable>n</replaceable> characters without
1128    raising an error. (This too is required by the
1129    <acronym>SQL</acronym> standard.)
1130   </para>
1131
1132   <para>
1133    The notations <type>varchar(<replaceable>n</replaceable>)</type> and
1134    <type>char(<replaceable>n</replaceable>)</type> are aliases for <type>character
1135    varying(<replaceable>n</replaceable>)</type> and
1136    <type>character(<replaceable>n</replaceable>)</type>, respectively.
1137    <type>character</type> without length specifier is equivalent to
1138    <type>character(1)</type>. If <type>character varying</type> is used
1139    without length specifier, the type accepts strings of any size. The
1140    latter is a <productname>PostgreSQL</productname> extension.
1141   </para>
1142
1143   <para>
1144    In addition, <productname>PostgreSQL</productname> provides the
1145    <type>text</type> type, which stores strings of any length.
1146    Although the type <type>text</type> is not in the
1147    <acronym>SQL</acronym> standard, several other SQL database
1148    management systems have it as well.
1149   </para>
1150
1151   <para>
1152    Values of type <type>character</type> are physically padded
1153    with spaces to the specified width <replaceable>n</replaceable>, and are
1154    stored and displayed that way.  However, trailing spaces are treated as
1155    semantically insignificant and disregarded when comparing two values
1156    of type <type>character</type>.  In collations where whitespace
1157    is significant, this behavior can produce unexpected results;
1158    for example <command>SELECT 'a '::CHAR(2) collate "C" &lt;
1159    E'a\n'::CHAR(2)</command> returns true, even though <literal>C</literal>
1160    locale would consider a space to be greater than a newline.
1161    Trailing spaces are removed when converting a <type>character</type> value
1162    to one of the other string types.  Note that trailing spaces
1163    <emphasis>are</emphasis> semantically significant in
1164    <type>character varying</type> and <type>text</type> values, and
1165    when using pattern matching, that is <literal>LIKE</literal> and
1166    regular expressions.
1167   </para>
1168
1169   <para>
1170    The characters that can be stored in any of these data types are
1171    determined by the database character set, which is selected when
1172    the database is created.  Regardless of the specific character set,
1173    the character with code zero (sometimes called NUL) cannot be stored.
1174    For more information refer to <xref linkend="multibyte"/>.
1175   </para>
1176
1177   <para>
1178    The storage requirement for a short string (up to 126 bytes) is 1 byte
1179    plus the actual string, which includes the space padding in the case of
1180    <type>character</type>.  Longer strings have 4 bytes of overhead instead
1181    of 1.  Long strings are compressed by the system automatically, so
1182    the physical requirement on disk might be less. Very long values are also
1183    stored in background tables so that they do not interfere with rapid
1184    access to shorter column values. In any case, the longest
1185    possible character string that can be stored is about 1 GB. (The
1186    maximum value that will be allowed for <replaceable>n</replaceable> in the data
1187    type declaration is less than that. It wouldn't be useful to
1188    change this because with multibyte character encodings the number of
1189    characters and bytes can be quite different. If you desire to
1190    store long strings with no specific upper limit, use
1191    <type>text</type> or <type>character varying</type> without a length
1192    specifier, rather than making up an arbitrary length limit.)
1193   </para>
1194
1195   <tip>
1196    <para>
1197     There is no performance difference among these three types,
1198     apart from increased storage space when using the blank-padded
1199     type, and a few extra CPU cycles to check the length when storing into
1200     a length-constrained column.  While
1201     <type>character(<replaceable>n</replaceable>)</type> has performance
1202     advantages in some other database systems, there is no such advantage in
1203     <productname>PostgreSQL</productname>; in fact
1204     <type>character(<replaceable>n</replaceable>)</type> is usually the slowest of
1205     the three because of its additional storage costs.  In most situations
1206     <type>text</type> or <type>character varying</type> should be used
1207     instead.
1208    </para>
1209   </tip>
1210
1211   <para>
1212    Refer to <xref linkend="sql-syntax-strings"/> for information about
1213    the syntax of string literals, and to <xref linkend="functions"/>
1214    for information about available operators and functions.
1215   </para>
1216
1217   <example>
1218    <title>Using the Character Types</title>
1219
1220<programlisting>
1221CREATE TABLE test1 (a character(4));
1222INSERT INTO test1 VALUES ('ok');
1223SELECT a, char_length(a) FROM test1; -- <co id="co.datatype-char"/>
1224<computeroutput>
1225  a   | char_length
1226------+-------------
1227 ok   |           2
1228</computeroutput>
1229
1230CREATE TABLE test2 (b varchar(5));
1231INSERT INTO test2 VALUES ('ok');
1232INSERT INTO test2 VALUES ('good      ');
1233INSERT INTO test2 VALUES ('too long');
1234<computeroutput>ERROR:  value too long for type character varying(5)</computeroutput>
1235INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
1236SELECT b, char_length(b) FROM test2;
1237<computeroutput>
1238   b   | char_length
1239-------+-------------
1240 ok    |           2
1241 good  |           5
1242 too l |           5
1243</computeroutput>
1244</programlisting>
1245    <calloutlist>
1246     <callout arearefs="co.datatype-char">
1247      <para>
1248       The <function>char_length</function> function is discussed in
1249       <xref linkend="functions-string"/>.
1250      </para>
1251     </callout>
1252    </calloutlist>
1253   </example>
1254
1255   <para>
1256    There are two other fixed-length character types in
1257    <productname>PostgreSQL</productname>, shown in <xref
1258    linkend="datatype-character-special-table"/>. The <type>name</type>
1259    type exists <emphasis>only</emphasis> for the storage of identifiers
1260    in the internal system catalogs and is not intended for use by the general user. Its
1261    length is currently defined as 64 bytes (63 usable characters plus
1262    terminator) but should be referenced using the constant
1263    <symbol>NAMEDATALEN</symbol> in <literal>C</literal> source code.
1264    The length is set at compile time (and
1265    is therefore adjustable for special uses); the default maximum
1266    length might change in a future release. The type <type>"char"</type>
1267    (note the quotes) is different from <type>char(1)</type> in that it
1268    only uses one byte of storage. It is internally used in the system
1269    catalogs as a simplistic enumeration type.
1270   </para>
1271
1272    <table id="datatype-character-special-table">
1273     <title>Special Character Types</title>
1274     <tgroup cols="3">
1275      <thead>
1276       <row>
1277        <entry>Name</entry>
1278        <entry>Storage Size</entry>
1279        <entry>Description</entry>
1280       </row>
1281      </thead>
1282      <tbody>
1283       <row>
1284        <entry><type>"char"</type></entry>
1285        <entry>1 byte</entry>
1286        <entry>single-byte internal type</entry>
1287       </row>
1288       <row>
1289        <entry><type>name</type></entry>
1290        <entry>64 bytes</entry>
1291        <entry>internal type for object names</entry>
1292       </row>
1293      </tbody>
1294     </tgroup>
1295    </table>
1296
1297  </sect1>
1298
1299 <sect1 id="datatype-binary">
1300  <title>Binary Data Types</title>
1301
1302  <indexterm zone="datatype-binary">
1303   <primary>binary data</primary>
1304  </indexterm>
1305
1306  <indexterm zone="datatype-binary">
1307   <primary>bytea</primary>
1308  </indexterm>
1309
1310   <para>
1311    The <type>bytea</type> data type allows storage of binary strings;
1312    see <xref linkend="datatype-binary-table"/>.
1313   </para>
1314
1315   <table id="datatype-binary-table">
1316    <title>Binary Data Types</title>
1317    <tgroup cols="3">
1318     <colspec colname="col1" colwidth="1*"/>
1319     <colspec colname="col2" colwidth="3*"/>
1320     <colspec colname="col3" colwidth="2*"/>
1321     <thead>
1322      <row>
1323       <entry>Name</entry>
1324       <entry>Storage Size</entry>
1325       <entry>Description</entry>
1326      </row>
1327     </thead>
1328     <tbody>
1329      <row>
1330       <entry><type>bytea</type></entry>
1331       <entry>1 or 4 bytes plus the actual binary string</entry>
1332       <entry>variable-length binary string</entry>
1333      </row>
1334     </tbody>
1335    </tgroup>
1336   </table>
1337
1338   <para>
1339    A binary string is a sequence of octets (or bytes).  Binary
1340    strings are distinguished from character strings in two
1341    ways.  First, binary strings specifically allow storing
1342    octets of value zero and other <quote>non-printable</quote>
1343    octets (usually, octets outside the decimal range 32 to 126).
1344    Character strings disallow zero octets, and also disallow any
1345    other octet values and sequences of octet values that are invalid
1346    according to the database's selected character set encoding.
1347    Second, operations on binary strings process the actual bytes,
1348    whereas the processing of character strings depends on locale settings.
1349    In short, binary strings are appropriate for storing data that the
1350    programmer thinks of as <quote>raw bytes</quote>, whereas character
1351    strings are appropriate for storing text.
1352   </para>
1353
1354   <para>
1355    The <type>bytea</type> type supports two
1356    formats for input and output: <quote>hex</quote> format
1357    and <productname>PostgreSQL</productname>'s historical
1358    <quote>escape</quote> format.  Both
1359    of these are always accepted on input.  The output format depends
1360    on the configuration parameter <xref linkend="guc-bytea-output"/>;
1361    the default is hex.  (Note that the hex format was introduced in
1362    <productname>PostgreSQL</productname> 9.0; earlier versions and some
1363    tools don't understand it.)
1364   </para>
1365
1366   <para>
1367    The <acronym>SQL</acronym> standard defines a different binary
1368    string type, called <type>BLOB</type> or <type>BINARY LARGE
1369    OBJECT</type>.  The input format is different from
1370    <type>bytea</type>, but the provided functions and operators are
1371    mostly the same.
1372   </para>
1373
1374  <sect2>
1375   <title><type>bytea</type> Hex Format</title>
1376
1377   <para>
1378    The <quote>hex</quote> format encodes binary data as 2 hexadecimal digits
1379    per byte, most significant nibble first.  The entire string is
1380    preceded by the sequence <literal>\x</literal> (to distinguish it
1381    from the escape format).  In some contexts, the initial backslash may
1382    need to be escaped by doubling it
1383    (see <xref linkend="sql-syntax-strings"/>).
1384    For input, the hexadecimal digits can
1385    be either upper or lower case, and whitespace is permitted between
1386    digit pairs (but not within a digit pair nor in the starting
1387    <literal>\x</literal> sequence).
1388    The hex format is compatible with a wide
1389    range of external applications and protocols, and it tends to be
1390    faster to convert than the escape format, so its use is preferred.
1391   </para>
1392
1393   <para>
1394    Example:
1395<programlisting>
1396SELECT '\xDEADBEEF';
1397</programlisting>
1398   </para>
1399  </sect2>
1400
1401  <sect2>
1402   <title><type>bytea</type> Escape Format</title>
1403
1404   <para>
1405    The <quote>escape</quote> format is the traditional
1406    <productname>PostgreSQL</productname> format for the <type>bytea</type>
1407    type.  It
1408    takes the approach of representing a binary string as a sequence
1409    of ASCII characters, while converting those bytes that cannot be
1410    represented as an ASCII character into special escape sequences.
1411    If, from the point of view of the application, representing bytes
1412    as characters makes sense, then this representation can be
1413    convenient.  But in practice it is usually confusing because it
1414    fuzzes up the distinction between binary strings and character
1415    strings, and also the particular escape mechanism that was chosen is
1416    somewhat unwieldy.  Therefore, this format should probably be avoided
1417    for most new applications.
1418   </para>
1419
1420   <para>
1421    When entering <type>bytea</type> values in escape format,
1422    octets of certain
1423    values <emphasis>must</emphasis> be escaped, while all octet
1424    values <emphasis>can</emphasis> be escaped.  In
1425    general, to escape an octet, convert it into its three-digit
1426    octal value and precede it by a backslash.
1427    Backslash itself (octet decimal value 92) can alternatively be represented by
1428    double backslashes.
1429    <xref linkend="datatype-binary-sqlesc"/>
1430    shows the characters that must be escaped, and gives the alternative
1431    escape sequences where applicable.
1432   </para>
1433
1434   <table id="datatype-binary-sqlesc">
1435    <title><type>bytea</type> Literal Escaped Octets</title>
1436    <tgroup cols="5">
1437     <colspec colname="col1" colwidth="1*"/>
1438     <colspec colname="col2" colwidth="1*"/>
1439     <colspec colname="col3" colwidth="1*"/>
1440     <colspec colname="col4" colwidth="1.25*"/>
1441     <colspec colname="col5" colwidth="1*"/>
1442     <thead>
1443      <row>
1444       <entry>Decimal Octet Value</entry>
1445       <entry>Description</entry>
1446       <entry>Escaped Input Representation</entry>
1447       <entry>Example</entry>
1448       <entry>Hex Representation</entry>
1449      </row>
1450     </thead>
1451
1452     <tbody>
1453      <row>
1454       <entry>0</entry>
1455       <entry>zero octet</entry>
1456       <entry><literal>'\000'</literal></entry>
1457       <entry><literal>'\000'::bytea</literal></entry>
1458       <entry><literal>\x00</literal></entry>
1459      </row>
1460
1461      <row>
1462       <entry>39</entry>
1463       <entry>single quote</entry>
1464       <entry><literal>''''</literal> or <literal>'\047'</literal></entry>
1465       <entry><literal>''''::bytea</literal></entry>
1466       <entry><literal>\x27</literal></entry>
1467      </row>
1468
1469      <row>
1470       <entry>92</entry>
1471       <entry>backslash</entry>
1472       <entry><literal>'\\'</literal> or <literal>'\134'</literal></entry>
1473       <entry><literal>'\\'::bytea</literal></entry>
1474       <entry><literal>\x5c</literal></entry>
1475      </row>
1476
1477      <row>
1478       <entry>0 to 31 and 127 to 255</entry>
1479       <entry><quote>non-printable</quote> octets</entry>
1480       <entry><literal>'\<replaceable>xxx'</replaceable></literal> (octal value)</entry>
1481       <entry><literal>'\001'::bytea</literal></entry>
1482       <entry><literal>\x01</literal></entry>
1483      </row>
1484
1485     </tbody>
1486    </tgroup>
1487   </table>
1488
1489   <para>
1490    The requirement to escape <emphasis>non-printable</emphasis> octets
1491    varies depending on locale settings. In some instances you can get away
1492    with leaving them unescaped.
1493   </para>
1494
1495   <para>
1496    The reason that single quotes must be doubled, as shown
1497    in <xref linkend="datatype-binary-sqlesc"/>, is that this
1498    is true for any string literal in a SQL command.  The generic
1499    string-literal parser consumes the outermost single quotes
1500    and reduces any pair of single quotes to one data character.
1501    What the <type>bytea</type> input function sees is just one
1502    single quote, which it treats as a plain data character.
1503    However, the <type>bytea</type> input function treats
1504    backslashes as special, and the other behaviors shown in
1505    <xref linkend="datatype-binary-sqlesc"/> are implemented by
1506    that function.
1507   </para>
1508
1509   <para>
1510    In some contexts, backslashes must be doubled compared to what is
1511    shown above, because the generic string-literal parser will also
1512    reduce pairs of backslashes to one data character;
1513    see <xref linkend="sql-syntax-strings"/>.
1514   </para>
1515
1516   <para>
1517    <type>Bytea</type> octets are output in <literal>hex</literal>
1518    format by default.  If you change <xref linkend="guc-bytea-output"/>
1519    to <literal>escape</literal>,
1520    <quote>non-printable</quote> octets are converted to their
1521    equivalent three-digit octal value and preceded by one backslash.
1522    Most <quote>printable</quote> octets are output by their standard
1523    representation in the client character set, e.g.:
1524
1525<programlisting>
1526SET bytea_output = 'escape';
1527
1528SELECT 'abc \153\154\155 \052\251\124'::bytea;
1529     bytea
1530----------------
1531 abc klm *\251T
1532</programlisting>
1533
1534    The octet with decimal value 92 (backslash) is doubled in the output.
1535    Details are in <xref linkend="datatype-binary-resesc"/>.
1536   </para>
1537
1538   <table id="datatype-binary-resesc">
1539    <title><type>bytea</type> Output Escaped Octets</title>
1540    <tgroup cols="5">
1541     <colspec colname="col1" colwidth="1*"/>
1542     <colspec colname="col2" colwidth="1*"/>
1543     <colspec colname="col3" colwidth="1*"/>
1544     <colspec colname="col4" colwidth="1.25*"/>
1545     <colspec colname="col5" colwidth="1*"/>
1546     <thead>
1547      <row>
1548       <entry>Decimal Octet Value</entry>
1549       <entry>Description</entry>
1550       <entry>Escaped Output Representation</entry>
1551       <entry>Example</entry>
1552       <entry>Output Result</entry>
1553      </row>
1554     </thead>
1555
1556     <tbody>
1557
1558      <row>
1559       <entry>92</entry>
1560       <entry>backslash</entry>
1561       <entry><literal>\\</literal></entry>
1562       <entry><literal>'\134'::bytea</literal></entry>
1563       <entry><literal>\\</literal></entry>
1564      </row>
1565
1566      <row>
1567       <entry>0 to 31 and 127 to 255</entry>
1568       <entry><quote>non-printable</quote> octets</entry>
1569       <entry><literal>\<replaceable>xxx</replaceable></literal> (octal value)</entry>
1570       <entry><literal>'\001'::bytea</literal></entry>
1571       <entry><literal>\001</literal></entry>
1572      </row>
1573
1574      <row>
1575       <entry>32 to 126</entry>
1576       <entry><quote>printable</quote> octets</entry>
1577       <entry>client character set representation</entry>
1578       <entry><literal>'\176'::bytea</literal></entry>
1579       <entry><literal>~</literal></entry>
1580      </row>
1581
1582     </tbody>
1583    </tgroup>
1584   </table>
1585
1586   <para>
1587    Depending on the front end to <productname>PostgreSQL</productname> you use,
1588    you might have additional work to do in terms of escaping and
1589    unescaping <type>bytea</type> strings. For example, you might also
1590    have to escape line feeds and carriage returns if your interface
1591    automatically translates these.
1592   </para>
1593  </sect2>
1594 </sect1>
1595
1596
1597  <sect1 id="datatype-datetime">
1598   <title>Date/Time Types</title>
1599
1600   <indexterm zone="datatype-datetime">
1601    <primary>date</primary>
1602   </indexterm>
1603   <indexterm zone="datatype-datetime">
1604    <primary>time</primary>
1605   </indexterm>
1606   <indexterm zone="datatype-datetime">
1607    <primary>time without time zone</primary>
1608   </indexterm>
1609   <indexterm zone="datatype-datetime">
1610    <primary>time with time zone</primary>
1611   </indexterm>
1612   <indexterm zone="datatype-datetime">
1613    <primary>timestamp</primary>
1614   </indexterm>
1615   <indexterm zone="datatype-datetime">
1616    <primary>timestamptz</primary>
1617   </indexterm>
1618   <indexterm zone="datatype-datetime">
1619    <primary>timestamp with time zone</primary>
1620   </indexterm>
1621   <indexterm zone="datatype-datetime">
1622    <primary>timestamp without time zone</primary>
1623   </indexterm>
1624   <indexterm zone="datatype-datetime">
1625    <primary>interval</primary>
1626   </indexterm>
1627   <indexterm zone="datatype-datetime">
1628    <primary>time span</primary>
1629   </indexterm>
1630
1631   <para>
1632    <productname>PostgreSQL</productname> supports the full set of
1633    <acronym>SQL</acronym> date and time types, shown in <xref
1634    linkend="datatype-datetime-table"/>.  The operations available
1635    on these data types are described in
1636    <xref linkend="functions-datetime"/>.
1637    Dates are counted according to the Gregorian calendar, even in
1638    years before that calendar was introduced (see <xref
1639    linkend="datetime-units-history"/> for more information).
1640   </para>
1641
1642    <table id="datatype-datetime-table">
1643     <title>Date/Time Types</title>
1644     <tgroup cols="6">
1645      <thead>
1646       <row>
1647        <entry>Name</entry>
1648        <entry>Storage Size</entry>
1649        <entry>Description</entry>
1650        <entry>Low Value</entry>
1651        <entry>High Value</entry>
1652        <entry>Resolution</entry>
1653       </row>
1654      </thead>
1655      <tbody>
1656       <row>
1657        <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
1658        <entry>8 bytes</entry>
1659        <entry>both date and time (no time zone)</entry>
1660        <entry>4713 BC</entry>
1661        <entry>294276 AD</entry>
1662        <entry>1 microsecond</entry>
1663       </row>
1664       <row>
1665        <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
1666        <entry>8 bytes</entry>
1667        <entry>both date and time, with time zone</entry>
1668        <entry>4713 BC</entry>
1669        <entry>294276 AD</entry>
1670        <entry>1 microsecond</entry>
1671       </row>
1672       <row>
1673        <entry><type>date</type></entry>
1674        <entry>4 bytes</entry>
1675        <entry>date (no time of day)</entry>
1676        <entry>4713 BC</entry>
1677        <entry>5874897 AD</entry>
1678        <entry>1 day</entry>
1679       </row>
1680       <row>
1681        <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
1682        <entry>8 bytes</entry>
1683        <entry>time of day (no date)</entry>
1684        <entry>00:00:00</entry>
1685        <entry>24:00:00</entry>
1686        <entry>1 microsecond</entry>
1687       </row>
1688       <row>
1689        <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
1690        <entry>12 bytes</entry>
1691        <entry>time of day (no date), with time zone</entry>
1692        <!-- see MAX_TZDISP_HOUR in datatype/timestamp.h -->
1693        <entry>00:00:00+1559</entry>
1694        <entry>24:00:00-1559</entry>
1695        <entry>1 microsecond</entry>
1696       </row>
1697       <row>
1698        <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
1699        <entry>16 bytes</entry>
1700        <entry>time interval</entry>
1701        <entry>-178000000 years</entry>
1702        <entry>178000000 years</entry>
1703        <entry>1 microsecond</entry>
1704       </row>
1705      </tbody>
1706     </tgroup>
1707    </table>
1708
1709   <note>
1710    <para>
1711     The SQL standard requires that writing just <type>timestamp</type>
1712     be equivalent to <type>timestamp without time
1713     zone</type>, and <productname>PostgreSQL</productname> honors that
1714     behavior.  <type>timestamptz</type> is accepted as an
1715     abbreviation for <type>timestamp with time zone</type>; this is a
1716     <productname>PostgreSQL</productname> extension.
1717    </para>
1718   </note>
1719
1720   <para>
1721    <type>time</type>, <type>timestamp</type>, and
1722    <type>interval</type> accept an optional precision value
1723    <replaceable>p</replaceable> which specifies the number of
1724    fractional digits retained in the seconds field. By default, there
1725    is no explicit bound on precision.  The allowed range of
1726    <replaceable>p</replaceable> is from 0 to 6.
1727   </para>
1728
1729   <para>
1730    The <type>interval</type> type has an additional option, which is
1731    to restrict the set of stored fields by writing one of these phrases:
1732<literallayout class="monospaced">
1733YEAR
1734MONTH
1735DAY
1736HOUR
1737MINUTE
1738SECOND
1739YEAR TO MONTH
1740DAY TO HOUR
1741DAY TO MINUTE
1742DAY TO SECOND
1743HOUR TO MINUTE
1744HOUR TO SECOND
1745MINUTE TO SECOND
1746</literallayout>
1747    Note that if both <replaceable>fields</replaceable> and
1748    <replaceable>p</replaceable> are specified, the
1749    <replaceable>fields</replaceable> must include <literal>SECOND</literal>,
1750    since the precision applies only to the seconds.
1751   </para>
1752
1753   <para>
1754    The type <type>time with time zone</type> is defined by the SQL
1755    standard, but the definition exhibits properties which lead to
1756    questionable usefulness. In most cases, a combination of
1757    <type>date</type>, <type>time</type>, <type>timestamp without time
1758    zone</type>, and <type>timestamp with time zone</type> should
1759    provide a complete range of date/time functionality required by
1760    any application.
1761   </para>
1762
1763   <sect2 id="datatype-datetime-input">
1764    <title>Date/Time Input</title>
1765
1766    <para>
1767     Date and time input is accepted in almost any reasonable format, including
1768     ISO 8601, <acronym>SQL</acronym>-compatible,
1769     traditional <productname>POSTGRES</productname>, and others.
1770     For some formats, ordering of day, month, and year in date input is
1771     ambiguous and there is support for specifying the expected
1772     ordering of these fields.  Set the <xref linkend="guc-datestyle"/> parameter
1773     to <literal>MDY</literal> to select month-day-year interpretation,
1774     <literal>DMY</literal> to select day-month-year interpretation, or
1775     <literal>YMD</literal> to select year-month-day interpretation.
1776    </para>
1777
1778    <para>
1779     <productname>PostgreSQL</productname> is more flexible in
1780     handling date/time input than the
1781     <acronym>SQL</acronym> standard requires.
1782     See <xref linkend="datetime-appendix"/>
1783     for the exact parsing rules of date/time input and for the
1784     recognized text fields including months, days of the week, and
1785     time zones.
1786    </para>
1787
1788    <para>
1789     Remember that any date or time literal input needs to be enclosed
1790     in single quotes, like text strings.  Refer to
1791     <xref linkend="sql-syntax-constants-generic"/> for more
1792     information.
1793     <acronym>SQL</acronym> requires the following syntax
1794<synopsis>
1795<replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
1796</synopsis>
1797     where <replaceable>p</replaceable> is an optional precision
1798     specification giving the number of
1799     fractional digits in the seconds field. Precision can be
1800     specified for <type>time</type>, <type>timestamp</type>, and
1801     <type>interval</type> types, and can range from 0 to 6.
1802     If no precision is specified in a constant specification,
1803     it defaults to the precision of the literal value (but not
1804     more than 6 digits).
1805    </para>
1806
1807    <sect3>
1808    <title>Dates</title>
1809
1810    <indexterm>
1811     <primary>date</primary>
1812    </indexterm>
1813
1814    <para>
1815     <xref linkend="datatype-datetime-date-table"/> shows some possible
1816     inputs for the <type>date</type> type.
1817    </para>
1818
1819     <table id="datatype-datetime-date-table">
1820      <title>Date Input</title>
1821      <tgroup cols="2">
1822       <colspec colname="col1" colwidth="1*"/>
1823       <colspec colname="col2" colwidth="2*"/>
1824       <thead>
1825        <row>
1826         <entry>Example</entry>
1827         <entry>Description</entry>
1828        </row>
1829       </thead>
1830       <tbody>
1831        <row>
1832         <entry>1999-01-08</entry>
1833         <entry>ISO 8601; January 8 in any mode
1834         (recommended format)</entry>
1835        </row>
1836        <row>
1837         <entry>January 8, 1999</entry>
1838         <entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
1839        </row>
1840        <row>
1841         <entry>1/8/1999</entry>
1842         <entry>January 8 in <literal>MDY</literal> mode;
1843          August 1 in <literal>DMY</literal> mode</entry>
1844        </row>
1845        <row>
1846         <entry>1/18/1999</entry>
1847         <entry>January 18 in <literal>MDY</literal> mode;
1848          rejected in other modes</entry>
1849        </row>
1850        <row>
1851         <entry>01/02/03</entry>
1852         <entry>January 2, 2003 in <literal>MDY</literal> mode;
1853          February 1, 2003 in <literal>DMY</literal> mode;
1854          February 3, 2001 in <literal>YMD</literal> mode
1855         </entry>
1856        </row>
1857        <row>
1858         <entry>1999-Jan-08</entry>
1859         <entry>January 8 in any mode</entry>
1860        </row>
1861        <row>
1862         <entry>Jan-08-1999</entry>
1863         <entry>January 8 in any mode</entry>
1864        </row>
1865        <row>
1866         <entry>08-Jan-1999</entry>
1867         <entry>January 8 in any mode</entry>
1868        </row>
1869        <row>
1870         <entry>99-Jan-08</entry>
1871         <entry>January 8 in <literal>YMD</literal> mode, else error</entry>
1872        </row>
1873        <row>
1874         <entry>08-Jan-99</entry>
1875         <entry>January 8, except error in <literal>YMD</literal> mode</entry>
1876        </row>
1877        <row>
1878         <entry>Jan-08-99</entry>
1879         <entry>January 8, except error in <literal>YMD</literal> mode</entry>
1880        </row>
1881        <row>
1882         <entry>19990108</entry>
1883         <entry>ISO 8601; January 8, 1999 in any mode</entry>
1884        </row>
1885        <row>
1886         <entry>990108</entry>
1887         <entry>ISO 8601; January 8, 1999 in any mode</entry>
1888        </row>
1889        <row>
1890         <entry>1999.008</entry>
1891         <entry>year and day of year</entry>
1892        </row>
1893        <row>
1894         <entry>J2451187</entry>
1895         <entry>Julian date</entry>
1896        </row>
1897        <row>
1898         <entry>January 8, 99 BC</entry>
1899         <entry>year 99 BC</entry>
1900        </row>
1901       </tbody>
1902      </tgroup>
1903     </table>
1904    </sect3>
1905
1906    <sect3>
1907     <title>Times</title>
1908
1909     <indexterm>
1910      <primary>time</primary>
1911     </indexterm>
1912     <indexterm>
1913      <primary>time without time zone</primary>
1914     </indexterm>
1915     <indexterm>
1916      <primary>time with time zone</primary>
1917     </indexterm>
1918
1919     <para>
1920      The time-of-day types are <type>time [
1921      (<replaceable>p</replaceable>) ] without time zone</type> and
1922      <type>time [ (<replaceable>p</replaceable>) ] with time
1923      zone</type>.  <type>time</type> alone is equivalent to
1924      <type>time without time zone</type>.
1925     </para>
1926
1927     <para>
1928      Valid input for these types consists of a time of day followed
1929      by an optional time zone. (See <xref
1930      linkend="datatype-datetime-time-table"/>
1931      and <xref linkend="datatype-timezone-table"/>.)  If a time zone is
1932      specified in the input for <type>time without time zone</type>,
1933      it is silently ignored. You can also specify a date but it will
1934      be ignored, except when you use a time zone name that involves a
1935      daylight-savings rule, such as
1936      <literal>America/New_York</literal>. In this case specifying the date
1937      is required in order to determine whether standard or daylight-savings
1938      time applies.  The appropriate time zone offset is recorded in the
1939      <type>time with time zone</type> value.
1940     </para>
1941
1942      <table id="datatype-datetime-time-table">
1943       <title>Time Input</title>
1944       <tgroup cols="2">
1945        <colspec colname="col1" colwidth="3*"/>
1946        <colspec colname="col2" colwidth="2*"/>
1947        <thead>
1948         <row>
1949          <entry>Example</entry>
1950          <entry>Description</entry>
1951         </row>
1952        </thead>
1953        <tbody>
1954         <row>
1955          <entry><literal>04:05:06.789</literal></entry>
1956          <entry>ISO 8601</entry>
1957         </row>
1958         <row>
1959          <entry><literal>04:05:06</literal></entry>
1960          <entry>ISO 8601</entry>
1961         </row>
1962         <row>
1963          <entry><literal>04:05</literal></entry>
1964          <entry>ISO 8601</entry>
1965         </row>
1966         <row>
1967          <entry><literal>040506</literal></entry>
1968          <entry>ISO 8601</entry>
1969         </row>
1970         <row>
1971          <entry><literal>04:05 AM</literal></entry>
1972          <entry>same as 04:05; AM does not affect value</entry>
1973         </row>
1974         <row>
1975          <entry><literal>04:05 PM</literal></entry>
1976          <entry>same as 16:05; input hour must be &lt;= 12</entry>
1977         </row>
1978         <row>
1979          <entry><literal>04:05:06.789-8</literal></entry>
1980          <entry>ISO 8601, with time zone as UTC offset</entry>
1981         </row>
1982         <row>
1983          <entry><literal>04:05:06-08:00</literal></entry>
1984          <entry>ISO 8601, with time zone as UTC offset</entry>
1985         </row>
1986         <row>
1987          <entry><literal>04:05-08:00</literal></entry>
1988          <entry>ISO 8601, with time zone as UTC offset</entry>
1989         </row>
1990         <row>
1991          <entry><literal>040506-08</literal></entry>
1992          <entry>ISO 8601, with time zone as UTC offset</entry>
1993         </row>
1994         <row>
1995          <entry><literal>040506+0730</literal></entry>
1996          <entry>ISO 8601, with fractional-hour time zone as UTC offset</entry>
1997         </row>
1998         <row>
1999          <entry><literal>040506+07:30:00</literal></entry>
2000          <entry>UTC offset specified to seconds (not allowed in ISO 8601)</entry>
2001         </row>
2002         <row>
2003          <entry><literal>04:05:06 PST</literal></entry>
2004          <entry>time zone specified by abbreviation</entry>
2005         </row>
2006         <row>
2007          <entry><literal>2003-04-12 04:05:06 America/New_York</literal></entry>
2008          <entry>time zone specified by full name</entry>
2009         </row>
2010        </tbody>
2011       </tgroup>
2012      </table>
2013
2014      <table tocentry="1" id="datatype-timezone-table">
2015       <title>Time Zone Input</title>
2016       <tgroup cols="2">
2017        <thead>
2018         <row>
2019          <entry>Example</entry>
2020          <entry>Description</entry>
2021         </row>
2022        </thead>
2023        <tbody>
2024         <row>
2025          <entry><literal>PST</literal></entry>
2026          <entry>Abbreviation (for Pacific Standard Time)</entry>
2027         </row>
2028         <row>
2029          <entry><literal>America/New_York</literal></entry>
2030          <entry>Full time zone name</entry>
2031         </row>
2032         <row>
2033          <entry><literal>PST8PDT</literal></entry>
2034          <entry>POSIX-style time zone specification</entry>
2035         </row>
2036         <row>
2037          <entry><literal>-8:00:00</literal></entry>
2038          <entry>UTC offset for PST</entry>
2039         </row>
2040         <row>
2041          <entry><literal>-8:00</literal></entry>
2042          <entry>UTC offset for PST (ISO 8601 extended format)</entry>
2043         </row>
2044         <row>
2045          <entry><literal>-800</literal></entry>
2046          <entry>UTC offset for PST (ISO 8601 basic format)</entry>
2047         </row>
2048         <row>
2049          <entry><literal>-8</literal></entry>
2050          <entry>UTC offset for PST (ISO 8601 basic format)</entry>
2051         </row>
2052         <row>
2053          <entry><literal>zulu</literal></entry>
2054          <entry>Military abbreviation for UTC</entry>
2055         </row>
2056         <row>
2057          <entry><literal>z</literal></entry>
2058          <entry>Short form of <literal>zulu</literal> (also in ISO 8601)</entry>
2059         </row>
2060        </tbody>
2061       </tgroup>
2062      </table>
2063
2064     <para>
2065     Refer to <xref linkend="datatype-timezones"/> for more information on how
2066     to specify time zones.
2067    </para>
2068    </sect3>
2069
2070    <sect3>
2071    <title>Time Stamps</title>
2072
2073    <indexterm>
2074     <primary>timestamp</primary>
2075    </indexterm>
2076
2077    <indexterm>
2078     <primary>timestamp with time zone</primary>
2079    </indexterm>
2080
2081    <indexterm>
2082     <primary>timestamp without time zone</primary>
2083    </indexterm>
2084
2085     <para>
2086      Valid input for the time stamp types consists of the concatenation
2087      of a date and a time, followed by an optional time zone,
2088      followed by an optional <literal>AD</literal> or <literal>BC</literal>.
2089      (Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear
2090      before the time zone, but this is not the preferred ordering.)
2091      Thus:
2092
2093<programlisting>
20941999-01-08 04:05:06
2095</programlisting>
2096      and:
2097<programlisting>
20981999-01-08 04:05:06 -8:00
2099</programlisting>
2100
2101      are valid values, which follow the <acronym>ISO</acronym> 8601
2102      standard.  In addition, the common format:
2103<programlisting>
2104January 8 04:05:06 1999 PST
2105</programlisting>
2106      is supported.
2107     </para>
2108
2109     <para>
2110      The <acronym>SQL</acronym> standard differentiates
2111      <type>timestamp without time zone</type>
2112      and <type>timestamp with time zone</type> literals by the presence of a
2113      <quote>+</quote> or <quote>-</quote> symbol and time zone offset after
2114      the time.  Hence, according to the standard,
2115
2116<programlisting>
2117TIMESTAMP '2004-10-19 10:23:54'
2118</programlisting>
2119
2120      is a <type>timestamp without time zone</type>, while
2121
2122<programlisting>
2123TIMESTAMP '2004-10-19 10:23:54+02'
2124</programlisting>
2125
2126      is a <type>timestamp with time zone</type>.
2127      <productname>PostgreSQL</productname> never examines the content of a
2128      literal string before determining its type, and therefore will treat
2129      both of the above as <type>timestamp without time zone</type>.  To
2130      ensure that a literal is treated as <type>timestamp with time
2131      zone</type>, give it the correct explicit type:
2132
2133<programlisting>
2134TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
2135</programlisting>
2136
2137      In a literal that has been determined to be <type>timestamp without time
2138      zone</type>, <productname>PostgreSQL</productname> will silently ignore
2139      any time zone indication.
2140      That is, the resulting value is derived from the date/time
2141      fields in the input value, and is not adjusted for time zone.
2142     </para>
2143
2144     <para>
2145      For <type>timestamp with time zone</type>, the internally stored
2146      value is always in UTC (Universal
2147      Coordinated Time, traditionally known as Greenwich Mean Time,
2148      <acronym>GMT</acronym>).  An input value that has an explicit
2149      time zone specified is converted to UTC using the appropriate offset
2150      for that time zone.  If no time zone is stated in the input string,
2151      then it is assumed to be in the time zone indicated by the system's
2152      <xref linkend="guc-timezone"/> parameter, and is converted to UTC using the
2153      offset for the <varname>timezone</varname> zone.
2154     </para>
2155
2156     <para>
2157      When a <type>timestamp with time
2158      zone</type> value is output, it is always converted from UTC to the
2159      current <varname>timezone</varname> zone, and displayed as local time in that
2160      zone.  To see the time in another time zone, either change
2161      <varname>timezone</varname> or use the <literal>AT TIME ZONE</literal> construct
2162      (see <xref linkend="functions-datetime-zoneconvert"/>).
2163     </para>
2164
2165     <para>
2166      Conversions between <type>timestamp without time zone</type> and
2167      <type>timestamp with time zone</type> normally assume that the
2168      <type>timestamp without time zone</type> value should be taken or given
2169      as <varname>timezone</varname> local time.  A different time zone can
2170      be specified for the conversion using <literal>AT TIME ZONE</literal>.
2171     </para>
2172    </sect3>
2173
2174    <sect3 id="datatype-datetime-special-values">
2175     <title>Special Values</title>
2176
2177     <indexterm>
2178      <primary>time</primary>
2179      <secondary>constants</secondary>
2180     </indexterm>
2181
2182     <indexterm>
2183      <primary>date</primary>
2184      <secondary>constants</secondary>
2185     </indexterm>
2186
2187     <para>
2188      <productname>PostgreSQL</productname> supports several
2189      special date/time input values for convenience, as shown in <xref
2190      linkend="datatype-datetime-special-table"/>.  The values
2191      <literal>infinity</literal> and <literal>-infinity</literal>
2192      are specially represented inside the system and will be displayed
2193      unchanged; but the others are simply notational shorthands
2194      that will be converted to ordinary date/time values when read.
2195      (In particular, <literal>now</literal> and related strings are converted
2196      to a specific time value as soon as they are read.)
2197      All of these values need to be enclosed in single quotes when used
2198      as constants in SQL commands.
2199     </para>
2200
2201      <table id="datatype-datetime-special-table">
2202       <title>Special Date/Time Inputs</title>
2203       <tgroup cols="3">
2204        <thead>
2205         <row>
2206          <entry>Input String</entry>
2207          <entry>Valid Types</entry>
2208          <entry>Description</entry>
2209         </row>
2210        </thead>
2211        <tbody>
2212         <row>
2213          <entry><literal>epoch</literal></entry>
2214          <entry><type>date</type>, <type>timestamp</type></entry>
2215          <entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry>
2216         </row>
2217         <row>
2218          <entry><literal>infinity</literal></entry>
2219          <entry><type>date</type>, <type>timestamp</type></entry>
2220          <entry>later than all other time stamps</entry>
2221         </row>
2222         <row>
2223          <entry><literal>-infinity</literal></entry>
2224          <entry><type>date</type>, <type>timestamp</type></entry>
2225          <entry>earlier than all other time stamps</entry>
2226         </row>
2227         <row>
2228          <entry><literal>now</literal></entry>
2229          <entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry>
2230          <entry>current transaction's start time</entry>
2231         </row>
2232         <row>
2233          <entry><literal>today</literal></entry>
2234          <entry><type>date</type>, <type>timestamp</type></entry>
2235          <entry>midnight (<literal>00:00</literal>) today</entry>
2236         </row>
2237         <row>
2238          <entry><literal>tomorrow</literal></entry>
2239          <entry><type>date</type>, <type>timestamp</type></entry>
2240          <entry>midnight (<literal>00:00</literal>) tomorrow</entry>
2241         </row>
2242         <row>
2243          <entry><literal>yesterday</literal></entry>
2244          <entry><type>date</type>, <type>timestamp</type></entry>
2245          <entry>midnight (<literal>00:00</literal>) yesterday</entry>
2246         </row>
2247         <row>
2248          <entry><literal>allballs</literal></entry>
2249          <entry><type>time</type></entry>
2250          <entry>00:00:00.00 UTC</entry>
2251         </row>
2252        </tbody>
2253       </tgroup>
2254      </table>
2255
2256     <para>
2257      The following <acronym>SQL</acronym>-compatible functions can also
2258      be used to obtain the current time value for the corresponding data
2259      type:
2260      <literal>CURRENT_DATE</literal>, <literal>CURRENT_TIME</literal>,
2261      <literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>,
2262      <literal>LOCALTIMESTAMP</literal>.  (See <xref
2263      linkend="functions-datetime-current"/>.)  Note that these are
2264      SQL functions and are <emphasis>not</emphasis> recognized in data input strings.
2265     </para>
2266
2267     <caution>
2268      <para>
2269       While the input strings <literal>now</literal>,
2270       <literal>today</literal>, <literal>tomorrow</literal>,
2271       and <literal>yesterday</literal> are fine to use in interactive SQL
2272       commands, they can have surprising behavior when the command is
2273       saved to be executed later, for example in prepared statements,
2274       views, and function definitions.  The string can be converted to a
2275       specific time value that continues to be used long after it becomes
2276       stale.  Use one of the SQL functions instead in such contexts.
2277       For example, <literal>CURRENT_DATE + 1</literal> is safer than
2278       <literal>'tomorrow'::date</literal>.
2279      </para>
2280     </caution>
2281
2282    </sect3>
2283   </sect2>
2284
2285   <sect2 id="datatype-datetime-output">
2286    <title>Date/Time Output</title>
2287
2288    <indexterm>
2289     <primary>date</primary>
2290     <secondary>output format</secondary>
2291     <seealso>formatting</seealso>
2292    </indexterm>
2293
2294    <indexterm>
2295     <primary>time</primary>
2296     <secondary>output format</secondary>
2297     <seealso>formatting</seealso>
2298    </indexterm>
2299
2300    <para>
2301     The output format of the date/time types can be set to one of the four
2302     styles ISO 8601,
2303     <acronym>SQL</acronym> (Ingres), traditional <productname>POSTGRES</productname>
2304     (Unix <application>date</application> format), or
2305     German.  The default
2306     is the <acronym>ISO</acronym> format.  (The
2307     <acronym>SQL</acronym> standard requires the use of the ISO 8601
2308     format.  The name of the <quote>SQL</quote> output format is a
2309     historical accident.)  <xref
2310     linkend="datatype-datetime-output-table"/> shows examples of each
2311     output style.  The output of the <type>date</type> and
2312     <type>time</type> types is generally only the date or time part
2313     in accordance with the given examples.  However, the
2314     <productname>POSTGRES</productname> style outputs date-only values in
2315     <acronym>ISO</acronym> format.
2316    </para>
2317
2318     <table id="datatype-datetime-output-table">
2319      <title>Date/Time Output Styles</title>
2320      <tgroup cols="3">
2321       <colspec colname="col1" colwidth="1*"/>
2322       <colspec colname="col2" colwidth="1*"/>
2323       <colspec colname="col3" colwidth="2*"/>
2324       <thead>
2325        <row>
2326         <entry>Style Specification</entry>
2327         <entry>Description</entry>
2328         <entry>Example</entry>
2329        </row>
2330       </thead>
2331       <tbody>
2332        <row>
2333         <entry><literal>ISO</literal></entry>
2334         <entry>ISO 8601, SQL standard</entry>
2335         <entry><literal>1997-12-17 07:37:16-08</literal></entry>
2336        </row>
2337        <row>
2338         <entry><literal>SQL</literal></entry>
2339         <entry>traditional style</entry>
2340         <entry><literal>12/17/1997 07:37:16.00 PST</literal></entry>
2341        </row>
2342        <row>
2343         <entry><literal>Postgres</literal></entry>
2344         <entry>original style</entry>
2345         <entry><literal>Wed Dec 17 07:37:16 1997 PST</literal></entry>
2346        </row>
2347        <row>
2348         <entry><literal>German</literal></entry>
2349         <entry>regional style</entry>
2350         <entry><literal>17.12.1997 07:37:16.00 PST</literal></entry>
2351        </row>
2352       </tbody>
2353      </tgroup>
2354     </table>
2355
2356    <note>
2357     <para>
2358      ISO 8601 specifies the use of uppercase letter <literal>T</literal> to separate
2359      the date and time.  <productname>PostgreSQL</productname> accepts that format on
2360      input, but on output it uses a space rather than <literal>T</literal>, as shown
2361      above.  This is for readability and for consistency with RFC 3339 as
2362      well as some other database systems.
2363     </para>
2364    </note>
2365
2366    <para>
2367     In the <acronym>SQL</acronym> and POSTGRES styles, day appears before
2368     month if DMY field ordering has been specified, otherwise month appears
2369     before day.
2370     (See <xref linkend="datatype-datetime-input"/>
2371     for how this setting also affects interpretation of input values.)
2372     <xref linkend="datatype-datetime-output2-table"/> shows examples.
2373    </para>
2374
2375     <table id="datatype-datetime-output2-table">
2376      <title>Date Order Conventions</title>
2377      <tgroup cols="3">
2378       <colspec colname="col1" colwidth="1*"/>
2379       <colspec colname="col2" colwidth="1*"/>
2380       <colspec colname="col3" colwidth="2*"/>
2381       <thead>
2382        <row>
2383         <entry><varname>datestyle</varname> Setting</entry>
2384         <entry>Input Ordering</entry>
2385         <entry>Example Output</entry>
2386        </row>
2387       </thead>
2388       <tbody>
2389        <row>
2390         <entry><literal>SQL, DMY</literal></entry>
2391         <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
2392         <entry><literal>17/12/1997 15:37:16.00 CET</literal></entry>
2393        </row>
2394        <row>
2395         <entry><literal>SQL, MDY</literal></entry>
2396         <entry><replaceable>month</replaceable>/<replaceable>day</replaceable>/<replaceable>year</replaceable></entry>
2397         <entry><literal>12/17/1997 07:37:16.00 PST</literal></entry>
2398        </row>
2399        <row>
2400         <entry><literal>Postgres, DMY</literal></entry>
2401         <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
2402         <entry><literal>Wed 17 Dec 07:37:16 1997 PST</literal></entry>
2403        </row>
2404       </tbody>
2405      </tgroup>
2406     </table>
2407
2408    <para>
2409     In the <acronym>ISO</acronym> style, the time zone is always shown as
2410     a signed numeric offset from UTC, with positive sign used for zones
2411     east of Greenwich.  The offset will be shown
2412     as <replaceable>hh</replaceable> (hours only) if it is an integral
2413     number of hours, else
2414     as <replaceable>hh</replaceable>:<replaceable>mm</replaceable> if it
2415     is an integral number of minutes, else as
2416     <replaceable>hh</replaceable>:<replaceable>mm</replaceable>:<replaceable>ss</replaceable>.
2417     (The third case is not possible with any modern time zone standard,
2418     but it can appear when working with timestamps that predate the
2419     adoption of standardized time zones.)
2420     In the other date styles, the time zone is shown as an alphabetic
2421     abbreviation if one is in common use in the current zone.  Otherwise
2422     it appears as a signed numeric offset in ISO 8601 basic format
2423     (<replaceable>hh</replaceable> or <replaceable>hhmm</replaceable>).
2424    </para>
2425
2426    <para>
2427     The date/time style can be selected by the user using the
2428     <command>SET datestyle</command> command, the <xref
2429     linkend="guc-datestyle"/> parameter in the
2430     <filename>postgresql.conf</filename> configuration file, or the
2431     <envar>PGDATESTYLE</envar> environment variable on the server or
2432     client.
2433    </para>
2434
2435    <para>
2436     The formatting function <function>to_char</function>
2437     (see <xref linkend="functions-formatting"/>) is also available as
2438     a more flexible way to format date/time output.
2439    </para>
2440   </sect2>
2441
2442   <sect2 id="datatype-timezones">
2443    <title>Time Zones</title>
2444
2445    <indexterm zone="datatype-timezones">
2446     <primary>time zone</primary>
2447    </indexterm>
2448
2449   <para>
2450    Time zones, and time-zone conventions, are influenced by
2451    political decisions, not just earth geometry. Time zones around the
2452    world became somewhat standardized during the 1900s,
2453    but continue to be prone to arbitrary changes, particularly with
2454    respect to daylight-savings rules.
2455    <productname>PostgreSQL</productname> uses the widely-used
2456    IANA (Olson) time zone database for information about
2457    historical time zone rules.  For times in the future, the assumption
2458    is that the latest known rules for a given time zone will
2459    continue to be observed indefinitely far into the future.
2460   </para>
2461
2462    <para>
2463     <productname>PostgreSQL</productname> endeavors to be compatible with
2464     the <acronym>SQL</acronym> standard definitions for typical usage.
2465     However, the <acronym>SQL</acronym> standard has an odd mix of date and
2466     time types and capabilities. Two obvious problems are:
2467
2468     <itemizedlist>
2469      <listitem>
2470       <para>
2471        Although the <type>date</type> type
2472        cannot have an associated time zone, the
2473        <type>time</type> type can.
2474        Time zones in the real world have little meaning unless
2475        associated with a date as well as a time,
2476        since the offset can vary through the year with daylight-saving
2477        time boundaries.
2478       </para>
2479      </listitem>
2480
2481      <listitem>
2482       <para>
2483        The default time zone is specified as a constant numeric offset
2484        from <acronym>UTC</acronym>. It is therefore impossible to adapt to
2485        daylight-saving time when doing date/time arithmetic across
2486        <acronym>DST</acronym> boundaries.
2487       </para>
2488      </listitem>
2489
2490     </itemizedlist>
2491    </para>
2492
2493    <para>
2494     To address these difficulties, we recommend using date/time types
2495     that contain both date and time when using time zones. We
2496     do <emphasis>not</emphasis> recommend using the type <type>time with
2497     time zone</type> (though it is supported by
2498     <productname>PostgreSQL</productname> for legacy applications and
2499     for compliance with the <acronym>SQL</acronym> standard).
2500     <productname>PostgreSQL</productname> assumes
2501     your local time zone for any type containing only date or time.
2502    </para>
2503
2504    <para>
2505     All timezone-aware dates and times are stored internally in
2506     <acronym>UTC</acronym>.  They are converted to local time
2507     in the zone specified by the <xref linkend="guc-timezone"/> configuration
2508     parameter before being displayed to the client.
2509    </para>
2510
2511    <para>
2512     <productname>PostgreSQL</productname> allows you to specify time zones in
2513     three different forms:
2514     <itemizedlist>
2515      <listitem>
2516       <para>
2517        A full time zone name, for example <literal>America/New_York</literal>.
2518        The recognized time zone names are listed in the
2519        <literal>pg_timezone_names</literal> view (see <xref
2520        linkend="view-pg-timezone-names"/>).
2521        <productname>PostgreSQL</productname> uses the widely-used IANA
2522        time zone data for this purpose, so the same time zone
2523        names are also recognized by other software.
2524       </para>
2525      </listitem>
2526      <listitem>
2527       <para>
2528        A time zone abbreviation, for example <literal>PST</literal>.  Such a
2529        specification merely defines a particular offset from UTC, in
2530        contrast to full time zone names which can imply a set of daylight
2531        savings transition rules as well.  The recognized abbreviations
2532        are listed in the <literal>pg_timezone_abbrevs</literal> view (see <xref
2533        linkend="view-pg-timezone-abbrevs"/>).  You cannot set the
2534        configuration parameters <xref linkend="guc-timezone"/> or
2535        <xref linkend="guc-log-timezone"/> to a time
2536        zone abbreviation, but you can use abbreviations in
2537        date/time input values and with the <literal>AT TIME ZONE</literal>
2538        operator.
2539       </para>
2540      </listitem>
2541      <listitem>
2542       <para>
2543        In addition to the timezone names and abbreviations,
2544        <productname>PostgreSQL</productname> will accept POSIX-style time zone
2545        specifications, as described in
2546        <xref linkend="datetime-posix-timezone-specs"/>.  This option is not
2547        normally preferable to using a named time zone, but it may be
2548        necessary if no suitable IANA time zone entry is available.
2549       </para>
2550      </listitem>
2551     </itemizedlist>
2552
2553     In short, this is the difference between abbreviations
2554     and full names: abbreviations represent a specific offset from UTC,
2555     whereas many of the full names imply a local daylight-savings time
2556     rule, and so have two possible UTC offsets.  As an example,
2557     <literal>2014-06-04 12:00 America/New_York</literal> represents noon local
2558     time in New York, which for this particular date was Eastern Daylight
2559     Time (UTC-4).  So <literal>2014-06-04 12:00 EDT</literal> specifies that
2560     same time instant.  But <literal>2014-06-04 12:00 EST</literal> specifies
2561     noon Eastern Standard Time (UTC-5), regardless of whether daylight
2562     savings was nominally in effect on that date.
2563    </para>
2564
2565    <para>
2566     To complicate matters, some jurisdictions have used the same timezone
2567     abbreviation to mean different UTC offsets at different times; for
2568     example, in Moscow <literal>MSK</literal> has meant UTC+3 in some years and
2569     UTC+4 in others.  <application>PostgreSQL</application> interprets such
2570     abbreviations according to whatever they meant (or had most recently
2571     meant) on the specified date; but, as with the <literal>EST</literal> example
2572     above, this is not necessarily the same as local civil time on that date.
2573    </para>
2574
2575    <para>
2576     In all cases, timezone names and abbreviations are recognized
2577     case-insensitively.  (This is a change from <productname>PostgreSQL</productname>
2578     versions prior to 8.2, which were case-sensitive in some contexts but
2579     not others.)
2580    </para>
2581
2582    <para>
2583     Neither timezone names nor abbreviations are hard-wired into the server;
2584     they are obtained from configuration files stored under
2585     <filename>.../share/timezone/</filename> and <filename>.../share/timezonesets/</filename>
2586     of the installation directory
2587     (see <xref linkend="datetime-config-files"/>).
2588    </para>
2589
2590    <para>
2591     The <xref linkend="guc-timezone"/> configuration parameter can
2592     be set in the file <filename>postgresql.conf</filename>, or in any of the
2593     other standard ways described in <xref linkend="runtime-config"/>.
2594     There are also some special ways to set it:
2595
2596     <itemizedlist>
2597      <listitem>
2598       <para>
2599        The <acronym>SQL</acronym> command <command>SET TIME ZONE</command>
2600        sets the time zone for the session.  This is an alternative spelling
2601        of <command>SET TIMEZONE TO</command> with a more SQL-spec-compatible syntax.
2602       </para>
2603      </listitem>
2604
2605      <listitem>
2606       <para>
2607        The <envar>PGTZ</envar> environment variable is used by
2608        <application>libpq</application> clients
2609        to send a <command>SET TIME ZONE</command>
2610        command to the server upon connection.
2611       </para>
2612      </listitem>
2613     </itemizedlist>
2614    </para>
2615   </sect2>
2616
2617   <sect2 id="datatype-interval-input">
2618    <title>Interval Input</title>
2619
2620    <indexterm>
2621     <primary>interval</primary>
2622    </indexterm>
2623
2624     <para>
2625      <type>interval</type> values can be written using the following
2626      verbose syntax:
2627
2628<synopsis>
2629<optional>@</optional> <replaceable>quantity</replaceable> <replaceable>unit</replaceable> <optional><replaceable>quantity</replaceable> <replaceable>unit</replaceable>...</optional> <optional><replaceable>direction</replaceable></optional>
2630</synopsis>
2631
2632     where <replaceable>quantity</replaceable> is a number (possibly signed);
2633     <replaceable>unit</replaceable> is <literal>microsecond</literal>,
2634     <literal>millisecond</literal>, <literal>second</literal>,
2635     <literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>,
2636     <literal>week</literal>, <literal>month</literal>, <literal>year</literal>,
2637     <literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>,
2638     or abbreviations or plurals of these units;
2639     <replaceable>direction</replaceable> can be <literal>ago</literal> or
2640     empty.  The at sign (<literal>@</literal>) is optional noise.  The amounts
2641     of the different units are implicitly added with appropriate
2642     sign accounting.  <literal>ago</literal> negates all the fields.
2643     This syntax is also used for interval output, if
2644     <xref linkend="guc-intervalstyle"/> is set to
2645     <literal>postgres_verbose</literal>.
2646    </para>
2647
2648    <para>
2649     Quantities of days, hours, minutes, and seconds can be specified without
2650     explicit unit markings.  For example, <literal>'1 12:59:10'</literal> is read
2651     the same as <literal>'1 day 12 hours 59 min 10 sec'</literal>.  Also,
2652     a combination of years and months can be specified with a dash;
2653     for example <literal>'200-10'</literal> is read the same as <literal>'200 years
2654     10 months'</literal>.  (These shorter forms are in fact the only ones allowed
2655     by the <acronym>SQL</acronym> standard, and are used for output when
2656     <varname>IntervalStyle</varname> is set to <literal>sql_standard</literal>.)
2657    </para>
2658
2659    <para>
2660     Interval values can also be written as ISO 8601 time intervals, using
2661     either the <quote>format with designators</quote> of the standard's section
2662     4.4.3.2 or the <quote>alternative format</quote> of section 4.4.3.3.  The
2663     format with designators looks like this:
2664<synopsis>
2665P <replaceable>quantity</replaceable> <replaceable>unit</replaceable> <optional> <replaceable>quantity</replaceable> <replaceable>unit</replaceable> ...</optional> <optional> T <optional> <replaceable>quantity</replaceable> <replaceable>unit</replaceable> ...</optional></optional>
2666</synopsis>
2667      The string must start with a <literal>P</literal>, and may include a
2668      <literal>T</literal> that introduces the time-of-day units.  The
2669      available unit abbreviations are given in <xref
2670      linkend="datatype-interval-iso8601-units"/>.  Units may be
2671      omitted, and may be specified in any order, but units smaller than
2672      a day must appear after <literal>T</literal>.  In particular, the meaning of
2673      <literal>M</literal> depends on whether it is before or after
2674      <literal>T</literal>.
2675     </para>
2676
2677     <table id="datatype-interval-iso8601-units">
2678      <title>ISO 8601 Interval Unit Abbreviations</title>
2679     <tgroup cols="2">
2680       <thead>
2681        <row>
2682         <entry>Abbreviation</entry>
2683         <entry>Meaning</entry>
2684        </row>
2685       </thead>
2686       <tbody>
2687        <row>
2688         <entry>Y</entry>
2689         <entry>Years</entry>
2690        </row>
2691        <row>
2692         <entry>M</entry>
2693         <entry>Months (in the date part)</entry>
2694        </row>
2695        <row>
2696         <entry>W</entry>
2697         <entry>Weeks</entry>
2698        </row>
2699        <row>
2700         <entry>D</entry>
2701         <entry>Days</entry>
2702        </row>
2703        <row>
2704         <entry>H</entry>
2705         <entry>Hours</entry>
2706        </row>
2707        <row>
2708         <entry>M</entry>
2709         <entry>Minutes (in the time part)</entry>
2710        </row>
2711        <row>
2712         <entry>S</entry>
2713         <entry>Seconds</entry>
2714        </row>
2715       </tbody>
2716      </tgroup>
2717     </table>
2718
2719     <para>
2720      In the alternative format:
2721<synopsis>
2722P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-<replaceable>days</replaceable> </optional> <optional> T <replaceable>hours</replaceable>:<replaceable>minutes</replaceable>:<replaceable>seconds</replaceable> </optional>
2723</synopsis>
2724      the string must begin with <literal>P</literal>, and a
2725      <literal>T</literal> separates the date and time parts of the interval.
2726      The values are given as numbers similar to ISO 8601 dates.
2727    </para>
2728
2729    <para>
2730     When writing an interval constant with a <replaceable>fields</replaceable>
2731     specification, or when assigning a string to an interval column that was
2732     defined with a <replaceable>fields</replaceable> specification, the interpretation of
2733     unmarked quantities depends on the <replaceable>fields</replaceable>.  For
2734     example <literal>INTERVAL '1' YEAR</literal> is read as 1 year, whereas
2735     <literal>INTERVAL '1'</literal> means 1 second.  Also, field values
2736     <quote>to the right</quote> of the least significant field allowed by the
2737     <replaceable>fields</replaceable> specification are silently discarded.  For
2738     example, writing <literal>INTERVAL '1 day 2:03:04' HOUR TO MINUTE</literal>
2739     results in dropping the seconds field, but not the day field.
2740    </para>
2741
2742    <para>
2743     According to the <acronym>SQL</acronym> standard all fields of an interval
2744     value must have the same sign, so a leading negative sign applies to all
2745     fields; for example the negative sign in the interval literal
2746     <literal>'-1 2:03:04'</literal> applies to both the days and hour/minute/second
2747     parts.  <productname>PostgreSQL</productname> allows the fields to have different
2748     signs, and traditionally treats each field in the textual representation
2749     as independently signed, so that the hour/minute/second part is
2750     considered positive in this example.  If <varname>IntervalStyle</varname> is
2751     set to <literal>sql_standard</literal> then a leading sign is considered
2752     to apply to all fields (but only if no additional signs appear).
2753     Otherwise the traditional <productname>PostgreSQL</productname> interpretation is
2754     used.  To avoid ambiguity, it's recommended to attach an explicit sign
2755     to each field if any field is negative.
2756    </para>
2757
2758    <para>
2759     Field values can have fractional parts:  for example, <literal>'1.5
2760     weeks'</literal> or <literal>'01:02:03.45'</literal>.  However,
2761     because interval internally stores only three integer units (months,
2762     days, microseconds), fractional units must be spilled to smaller
2763     units.  Fractional parts of units greater than months are truncated to
2764     be an integer number of months, e.g. <literal>'1.5 years'</literal>
2765     becomes <literal>'1 year 6 mons'</literal>.  Fractional parts of
2766     weeks and days are computed to be an integer number of days and
2767     microseconds, assuming 30 days per month and 24 hours per day, e.g.,
2768     <literal>'1.75 months'</literal> becomes <literal>1 mon 22 days
2769     12:00:00</literal>.  Only seconds will ever be shown as fractional
2770     on output.
2771    </para>
2772
2773    <para>
2774     <xref linkend="datatype-interval-input-examples"/> shows some examples
2775     of valid <type>interval</type> input.
2776    </para>
2777
2778     <table id="datatype-interval-input-examples">
2779      <title>Interval Input</title>
2780      <tgroup cols="2">
2781       <thead>
2782        <row>
2783         <entry>Example</entry>
2784         <entry>Description</entry>
2785        </row>
2786       </thead>
2787       <tbody>
2788        <row>
2789         <entry><literal>1-2</literal></entry>
2790         <entry>SQL standard format: 1 year 2 months</entry>
2791        </row>
2792        <row>
2793         <entry><literal>3 4:05:06</literal></entry>
2794         <entry>SQL standard format: 3 days 4 hours 5 minutes 6 seconds</entry>
2795        </row>
2796        <row>
2797         <entry><literal>1 year 2 months 3 days 4 hours 5 minutes 6 seconds</literal></entry>
2798         <entry>Traditional Postgres format: 1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
2799        </row>
2800        <row>
2801         <entry><literal>P1Y2M3DT4H5M6S</literal></entry>
2802         <entry>ISO 8601 <quote>format with designators</quote>: same meaning as above</entry>
2803        </row>
2804        <row>
2805         <entry><literal>P0001-02-03T04:05:06</literal></entry>
2806         <entry>ISO 8601 <quote>alternative format</quote>: same meaning as above</entry>
2807        </row>
2808       </tbody>
2809      </tgroup>
2810     </table>
2811
2812    <para>
2813     Internally <type>interval</type> values are stored as months, days,
2814     and microseconds. This is done because the number of days in a month
2815     varies, and a day can have 23 or 25 hours if a daylight savings
2816     time adjustment is involved.  The months and days fields are integers
2817     while the microseconds field can store fractional seconds.  Because intervals are
2818     usually created from constant strings or <type>timestamp</type> subtraction,
2819     this storage method works well in most cases, but can cause unexpected
2820     results:
2821
2822<programlisting>
2823SELECT EXTRACT(hours from '80 minutes'::interval);
2824 date_part
2825-----------
2826         1
2827
2828SELECT EXTRACT(days from '80 hours'::interval);
2829 date_part
2830-----------
2831         0
2832</programlisting>
2833
2834     Functions <function>justify_days</function> and
2835     <function>justify_hours</function> are available for adjusting days
2836     and hours that overflow their normal ranges.
2837    </para>
2838
2839   </sect2>
2840
2841   <sect2 id="datatype-interval-output">
2842    <title>Interval Output</title>
2843
2844    <indexterm>
2845     <primary>interval</primary>
2846     <secondary>output format</secondary>
2847     <seealso>formatting</seealso>
2848    </indexterm>
2849
2850    <para>
2851     The output format of the interval type can be set to one of the
2852     four styles <literal>sql_standard</literal>, <literal>postgres</literal>,
2853     <literal>postgres_verbose</literal>, or <literal>iso_8601</literal>,
2854     using the command <literal>SET intervalstyle</literal>.
2855     The default is the <literal>postgres</literal> format.
2856     <xref linkend="interval-style-output-table"/> shows examples of each
2857     output style.
2858    </para>
2859
2860    <para>
2861     The <literal>sql_standard</literal> style produces output that conforms to
2862     the SQL standard's specification for interval literal strings, if
2863     the interval value meets the standard's restrictions (either year-month
2864     only or day-time only, with no mixing of positive
2865     and negative components).  Otherwise the output looks like a standard
2866     year-month literal string followed by a day-time literal string,
2867     with explicit signs added to disambiguate mixed-sign intervals.
2868    </para>
2869
2870    <para>
2871     The output of the <literal>postgres</literal> style matches the output of
2872     <productname>PostgreSQL</productname> releases prior to 8.4 when the
2873     <xref linkend="guc-datestyle"/> parameter was set to <literal>ISO</literal>.
2874    </para>
2875
2876    <para>
2877     The output of the <literal>postgres_verbose</literal> style matches the output of
2878     <productname>PostgreSQL</productname> releases prior to 8.4 when the
2879     <varname>DateStyle</varname> parameter was set to non-<literal>ISO</literal> output.
2880    </para>
2881
2882    <para>
2883     The output of the <literal>iso_8601</literal> style matches the <quote>format
2884     with designators</quote> described in section 4.4.3.2 of the
2885     ISO 8601 standard.
2886    </para>
2887
2888     <table id="interval-style-output-table">
2889       <title>Interval Output Style Examples</title>
2890       <tgroup cols="4">
2891        <thead>
2892         <row>
2893          <entry>Style Specification</entry>
2894          <entry>Year-Month Interval</entry>
2895          <entry>Day-Time Interval</entry>
2896          <entry>Mixed Interval</entry>
2897         </row>
2898        </thead>
2899        <tbody>
2900         <row>
2901          <entry><literal>sql_standard</literal></entry>
2902          <entry>1-2</entry>
2903          <entry>3 4:05:06</entry>
2904          <entry>-1-2 +3 -4:05:06</entry>
2905         </row>
2906         <row>
2907          <entry><literal>postgres</literal></entry>
2908          <entry>1 year 2 mons</entry>
2909          <entry>3 days 04:05:06</entry>
2910          <entry>-1 year -2 mons +3 days -04:05:06</entry>
2911         </row>
2912         <row>
2913          <entry><literal>postgres_verbose</literal></entry>
2914          <entry>@ 1 year 2 mons</entry>
2915          <entry>@ 3 days 4 hours 5 mins 6 secs</entry>
2916          <entry>@ 1 year 2 mons -3 days 4 hours 5 mins 6 secs ago</entry>
2917         </row>
2918         <row>
2919          <entry><literal>iso_8601</literal></entry>
2920          <entry>P1Y2M</entry>
2921          <entry>P3DT4H5M6S</entry>
2922          <entry>P-1Y-2M3D&zwsp;T-4H-5M-6S</entry>
2923         </row>
2924        </tbody>
2925       </tgroup>
2926    </table>
2927
2928   </sect2>
2929
2930  </sect1>
2931
2932  <sect1 id="datatype-boolean">
2933   <title>Boolean Type</title>
2934
2935   <indexterm zone="datatype-boolean">
2936    <primary>Boolean</primary>
2937    <secondary>data type</secondary>
2938   </indexterm>
2939
2940   <indexterm zone="datatype-boolean">
2941    <primary>true</primary>
2942   </indexterm>
2943
2944   <indexterm zone="datatype-boolean">
2945    <primary>false</primary>
2946   </indexterm>
2947
2948   <para>
2949    <productname>PostgreSQL</productname> provides the
2950    standard <acronym>SQL</acronym> type <type>boolean</type>;
2951    see <xref linkend="datatype-boolean-table"/>.
2952    The <type>boolean</type> type can have several states:
2953    <quote>true</quote>, <quote>false</quote>, and a third state,
2954    <quote>unknown</quote>, which is represented by the
2955    <acronym>SQL</acronym> null value.
2956   </para>
2957
2958   <table id="datatype-boolean-table">
2959    <title>Boolean Data Type</title>
2960    <tgroup cols="3">
2961     <thead>
2962      <row>
2963       <entry>Name</entry>
2964       <entry>Storage Size</entry>
2965       <entry>Description</entry>
2966      </row>
2967     </thead>
2968     <tbody>
2969      <row>
2970       <entry><type>boolean</type></entry>
2971       <entry>1 byte</entry>
2972       <entry>state of true or false</entry>
2973      </row>
2974     </tbody>
2975    </tgroup>
2976   </table>
2977
2978   <para>
2979    Boolean constants can be represented in SQL queries by the SQL
2980    key words <literal>TRUE</literal>, <literal>FALSE</literal>,
2981    and <literal>NULL</literal>.
2982   </para>
2983
2984   <para>
2985    The datatype input function for type <type>boolean</type> accepts these
2986    string representations for the <quote>true</quote> state:
2987    <simplelist>
2988     <member><literal>true</literal></member>
2989     <member><literal>yes</literal></member>
2990     <member><literal>on</literal></member>
2991     <member><literal>1</literal></member>
2992    </simplelist>
2993    and these representations for the <quote>false</quote> state:
2994    <simplelist>
2995     <member><literal>false</literal></member>
2996     <member><literal>no</literal></member>
2997     <member><literal>off</literal></member>
2998     <member><literal>0</literal></member>
2999    </simplelist>
3000    Unique prefixes of these strings are also accepted, for
3001    example <literal>t</literal> or <literal>n</literal>.
3002    Leading or trailing whitespace is ignored, and case does not matter.
3003   </para>
3004
3005   <para>
3006    The datatype output function for type <type>boolean</type> always emits
3007    either <literal>t</literal> or <literal>f</literal>, as shown in
3008    <xref linkend="datatype-boolean-example"/>.
3009   </para>
3010
3011   <example id="datatype-boolean-example">
3012    <title>Using the <type>boolean</type> Type</title>
3013
3014<programlisting>
3015CREATE TABLE test1 (a boolean, b text);
3016INSERT INTO test1 VALUES (TRUE, 'sic est');
3017INSERT INTO test1 VALUES (FALSE, 'non est');
3018SELECT * FROM test1;
3019 a |    b
3020---+---------
3021 t | sic est
3022 f | non est
3023
3024SELECT * FROM test1 WHERE a;
3025 a |    b
3026---+---------
3027 t | sic est
3028</programlisting>
3029   </example>
3030
3031   <para>
3032    The key words <literal>TRUE</literal> and <literal>FALSE</literal> are
3033    the preferred (<acronym>SQL</acronym>-compliant) method for writing
3034    Boolean constants in SQL queries.  But you can also use the string
3035    representations by following the generic string-literal constant syntax
3036    described in <xref linkend="sql-syntax-constants-generic"/>, for
3037    example <literal>'yes'::boolean</literal>.
3038   </para>
3039
3040   <para>
3041    Note that the parser automatically understands
3042    that <literal>TRUE</literal> and <literal>FALSE</literal> are of
3043    type <type>boolean</type>, but this is not so
3044    for <literal>NULL</literal> because that can have any type.
3045    So in some contexts you might have to cast <literal>NULL</literal>
3046    to <type>boolean</type> explicitly, for
3047    example <literal>NULL::boolean</literal>.  Conversely, the cast can be
3048    omitted from a string-literal Boolean value in contexts where the parser
3049    can deduce that the literal must be of type <type>boolean</type>.
3050   </para>
3051  </sect1>
3052
3053  <sect1 id="datatype-enum">
3054   <title>Enumerated Types</title>
3055
3056   <indexterm zone="datatype-enum">
3057    <primary>data type</primary>
3058    <secondary>enumerated (enum)</secondary>
3059   </indexterm>
3060
3061   <indexterm zone="datatype-enum">
3062    <primary>enumerated types</primary>
3063   </indexterm>
3064
3065   <para>
3066    Enumerated (enum) types are data types that
3067    comprise a static, ordered set of values.
3068    They are equivalent to the <type>enum</type>
3069    types supported in a number of programming languages. An example of an enum
3070    type might be the days of the week, or a set of status values for
3071    a piece of data.
3072   </para>
3073
3074   <sect2>
3075    <title>Declaration of Enumerated Types</title>
3076
3077    <para>
3078     Enum types are created using the <xref
3079     linkend="sql-createtype"/> command,
3080     for example:
3081
3082<programlisting>
3083CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
3084</programlisting>
3085
3086     Once created, the enum type can be used in table and function
3087     definitions much like any other type:
3088<programlisting>
3089CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
3090CREATE TABLE person (
3091    name text,
3092    current_mood mood
3093);
3094INSERT INTO person VALUES ('Moe', 'happy');
3095SELECT * FROM person WHERE current_mood = 'happy';
3096 name | current_mood
3097------+--------------
3098 Moe  | happy
3099(1 row)
3100</programlisting>
3101    </para>
3102    </sect2>
3103
3104    <sect2>
3105     <title>Ordering</title>
3106
3107     <para>
3108      The ordering of the values in an enum type is the
3109      order in which the values were listed when the type was created.
3110      All standard comparison operators and related
3111      aggregate functions are supported for enums.  For example:
3112
3113<programlisting>
3114INSERT INTO person VALUES ('Larry', 'sad');
3115INSERT INTO person VALUES ('Curly', 'ok');
3116SELECT * FROM person WHERE current_mood > 'sad';
3117 name  | current_mood
3118-------+--------------
3119 Moe   | happy
3120 Curly | ok
3121(2 rows)
3122
3123SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood;
3124 name  | current_mood
3125-------+--------------
3126 Curly | ok
3127 Moe   | happy
3128(2 rows)
3129
3130SELECT name
3131FROM person
3132WHERE current_mood = (SELECT MIN(current_mood) FROM person);
3133 name
3134-------
3135 Larry
3136(1 row)
3137</programlisting>
3138     </para>
3139   </sect2>
3140
3141   <sect2>
3142    <title>Type Safety</title>
3143
3144    <para>
3145     Each enumerated data type is separate and cannot
3146     be compared with other enumerated types.  See this example:
3147
3148<programlisting>
3149CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic');
3150CREATE TABLE holidays (
3151    num_weeks integer,
3152    happiness happiness
3153);
3154INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy');
3155INSERT INTO holidays(num_weeks,happiness) VALUES (6, 'very happy');
3156INSERT INTO holidays(num_weeks,happiness) VALUES (8, 'ecstatic');
3157INSERT INTO holidays(num_weeks,happiness) VALUES (2, 'sad');
3158ERROR:  invalid input value for enum happiness: "sad"
3159SELECT person.name, holidays.num_weeks FROM person, holidays
3160  WHERE person.current_mood = holidays.happiness;
3161ERROR:  operator does not exist: mood = happiness
3162</programlisting>
3163    </para>
3164
3165    <para>
3166     If you really need to do something like that, you can either
3167     write a custom operator or add explicit casts to your query:
3168
3169<programlisting>
3170SELECT person.name, holidays.num_weeks FROM person, holidays
3171  WHERE person.current_mood::text = holidays.happiness::text;
3172 name | num_weeks
3173------+-----------
3174 Moe  |         4
3175(1 row)
3176
3177</programlisting>
3178    </para>
3179   </sect2>
3180
3181   <sect2>
3182    <title>Implementation Details</title>
3183
3184    <para>
3185     Enum labels are case sensitive, so
3186     <type>'happy'</type> is not the same as <type>'HAPPY'</type>.
3187     White space in the labels is significant too.
3188    </para>
3189
3190    <para>
3191     Although enum types are primarily intended for static sets of values,
3192     there is support for adding new values to an existing enum type, and for
3193     renaming values (see <xref linkend="sql-altertype"/>).  Existing values
3194     cannot be removed from an enum type, nor can the sort ordering of such
3195     values be changed, short of dropping and re-creating the enum type.
3196    </para>
3197
3198    <para>
3199     An enum value occupies four bytes on disk.  The length of an enum
3200     value's textual label is limited by the <symbol>NAMEDATALEN</symbol>
3201     setting compiled into <productname>PostgreSQL</productname>; in standard
3202     builds this means at most 63 bytes.
3203    </para>
3204
3205    <para>
3206     The translations from internal enum values to textual labels are
3207     kept in the system catalog
3208     <link linkend="catalog-pg-enum"><structname>pg_enum</structname></link>.
3209     Querying this catalog directly can be useful.
3210    </para>
3211
3212   </sect2>
3213  </sect1>
3214
3215  <sect1 id="datatype-geometric">
3216   <title>Geometric Types</title>
3217
3218   <para>
3219    Geometric data types represent two-dimensional spatial
3220    objects. <xref linkend="datatype-geo-table"/> shows the geometric
3221    types available in <productname>PostgreSQL</productname>.
3222   </para>
3223
3224    <table id="datatype-geo-table">
3225     <title>Geometric Types</title>
3226     <tgroup cols="4">
3227      <colspec colname="col1" colwidth="1*"/>
3228      <colspec colname="col2" colwidth="1*"/>
3229      <colspec colname="col3" colwidth="2*"/>
3230      <colspec colname="col4" colwidth="1*"/>
3231      <thead>
3232       <row>
3233        <entry>Name</entry>
3234        <entry>Storage Size</entry>
3235        <entry>Description</entry>
3236        <entry>Representation</entry>
3237       </row>
3238      </thead>
3239      <tbody>
3240       <row>
3241        <entry><type>point</type></entry>
3242        <entry>16 bytes</entry>
3243        <entry>Point on a plane</entry>
3244        <entry>(x,y)</entry>
3245       </row>
3246       <row>
3247        <entry><type>line</type></entry>
3248        <entry>32 bytes</entry>
3249        <entry>Infinite line</entry>
3250        <entry>{A,B,C}</entry>
3251       </row>
3252       <row>
3253        <entry><type>lseg</type></entry>
3254        <entry>32 bytes</entry>
3255        <entry>Finite line segment</entry>
3256        <entry>((x1,y1),(x2,y2))</entry>
3257       </row>
3258       <row>
3259        <entry><type>box</type></entry>
3260        <entry>32 bytes</entry>
3261        <entry>Rectangular box</entry>
3262        <entry>((x1,y1),(x2,y2))</entry>
3263       </row>
3264       <row>
3265        <entry><type>path</type></entry>
3266        <entry>16+16n bytes</entry>
3267        <entry>Closed path (similar to polygon)</entry>
3268        <entry>((x1,y1),...)</entry>
3269       </row>
3270       <row>
3271        <entry><type>path</type></entry>
3272        <entry>16+16n bytes</entry>
3273        <entry>Open path</entry>
3274        <entry>[(x1,y1),...]</entry>
3275       </row>
3276       <row>
3277        <entry><type>polygon</type></entry>
3278        <entry>40+16n bytes</entry>
3279        <entry>Polygon (similar to closed path)</entry>
3280        <entry>((x1,y1),...)</entry>
3281       </row>
3282       <row>
3283        <entry><type>circle</type></entry>
3284        <entry>24 bytes</entry>
3285        <entry>Circle</entry>
3286        <entry>&lt;(x,y),r&gt; (center point and radius)</entry>
3287       </row>
3288      </tbody>
3289     </tgroup>
3290    </table>
3291
3292   <para>
3293    A rich set of functions and operators is available to perform various geometric
3294    operations such as scaling, translation, rotation, and determining
3295    intersections.  They are explained in <xref linkend="functions-geometry"/>.
3296   </para>
3297
3298   <sect2>
3299    <title>Points</title>
3300
3301    <indexterm>
3302     <primary>point</primary>
3303    </indexterm>
3304
3305    <para>
3306     Points are the fundamental two-dimensional building block for geometric
3307     types.  Values of type <type>point</type> are specified using either of
3308     the following syntaxes:
3309
3310<synopsis>
3311( <replaceable>x</replaceable> , <replaceable>y</replaceable> )
3312  <replaceable>x</replaceable> , <replaceable>y</replaceable>
3313</synopsis>
3314
3315     where <replaceable>x</replaceable> and <replaceable>y</replaceable> are the respective
3316     coordinates, as floating-point numbers.
3317    </para>
3318
3319    <para>
3320     Points are output using the first syntax.
3321    </para>
3322   </sect2>
3323
3324   <sect2 id="datatype-line">
3325    <title>Lines</title>
3326
3327    <indexterm>
3328     <primary>line</primary>
3329    </indexterm>
3330
3331    <para>
3332     Lines are represented by the linear
3333     equation <replaceable>A</replaceable>x + <replaceable>B</replaceable>y + <replaceable>C</replaceable> = 0,
3334     where <replaceable>A</replaceable> and <replaceable>B</replaceable> are not both zero.  Values
3335     of type <type>line</type> are input and output in the following form:
3336<synopsis>
3337{ <replaceable>A</replaceable>, <replaceable>B</replaceable>, <replaceable>C</replaceable> }
3338</synopsis>
3339
3340     Alternatively, any of the following forms can be used for input:
3341
3342<synopsis>
3343[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ]
3344( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
3345  ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
3346    <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   ,   <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
3347</synopsis>
3348
3349     where
3350     <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
3351     and
3352     <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
3353     are two different points on the line.
3354    </para>
3355   </sect2>
3356
3357   <sect2 id="datatype-lseg">
3358    <title>Line Segments</title>
3359
3360    <indexterm>
3361     <primary>lseg</primary>
3362    </indexterm>
3363
3364    <indexterm>
3365     <primary>line segment</primary>
3366    </indexterm>
3367
3368    <para>
3369     Line segments are represented by pairs of points that are the endpoints
3370     of the segment.  Values of type <type>lseg</type> are specified using any
3371     of the following syntaxes:
3372
3373<synopsis>
3374[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ]
3375( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
3376  ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
3377    <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   ,   <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
3378</synopsis>
3379
3380     where
3381     <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
3382     and
3383     <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
3384     are the end points of the line segment.
3385    </para>
3386
3387    <para>
3388     Line segments are output using the first syntax.
3389    </para>
3390   </sect2>
3391
3392   <sect2>
3393    <title>Boxes</title>
3394
3395    <indexterm>
3396     <primary>box (data type)</primary>
3397    </indexterm>
3398
3399    <indexterm>
3400     <primary>rectangle</primary>
3401    </indexterm>
3402
3403    <para>
3404     Boxes are represented by pairs of points that are opposite
3405     corners of the box.
3406     Values of type <type>box</type> are specified using any of the following
3407     syntaxes:
3408
3409<synopsis>
3410( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
3411  ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
3412    <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   ,   <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
3413</synopsis>
3414
3415     where
3416     <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
3417     and
3418     <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
3419     are any two opposite corners of the box.
3420    </para>
3421
3422    <para>
3423     Boxes are output using the second syntax.
3424    </para>
3425
3426    <para>
3427     Any two opposite corners can be supplied on input, but the values
3428     will be reordered as needed to store the
3429     upper right and lower left corners, in that order.
3430    </para>
3431   </sect2>
3432
3433   <sect2>
3434    <title>Paths</title>
3435
3436    <indexterm>
3437     <primary>path (data type)</primary>
3438    </indexterm>
3439
3440    <para>
3441     Paths are represented by lists of connected points. Paths can be
3442     <firstterm>open</firstterm>, where
3443     the first and last points in the list are considered not connected, or
3444     <firstterm>closed</firstterm>,
3445     where the first and last points are considered connected.
3446    </para>
3447
3448    <para>
3449     Values of type <type>path</type> are specified using any of the following
3450     syntaxes:
3451
3452<synopsis>
3453[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ]
3454( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
3455  ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3456  ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3457    <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
3458</synopsis>
3459
3460     where the points are the end points of the line segments
3461     comprising the path.  Square brackets (<literal>[]</literal>) indicate
3462     an open path, while parentheses (<literal>()</literal>) indicate a
3463     closed path.  When the outermost parentheses are omitted, as
3464     in the third through fifth syntaxes, a closed path is assumed.
3465    </para>
3466
3467    <para>
3468     Paths are output using the first or second syntax, as appropriate.
3469    </para>
3470   </sect2>
3471
3472   <sect2 id="datatype-polygon">
3473    <title>Polygons</title>
3474
3475    <indexterm>
3476     <primary>polygon</primary>
3477    </indexterm>
3478
3479    <para>
3480     Polygons are represented by lists of points (the vertexes of the
3481     polygon). Polygons are very similar to closed paths, but are
3482     stored differently and have their own set of support routines.
3483    </para>
3484
3485    <para>
3486     Values of type <type>polygon</type> are specified using any of the
3487     following syntaxes:
3488
3489<synopsis>
3490( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
3491  ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3492  ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3493    <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
3494</synopsis>
3495
3496     where the points are the end points of the line segments
3497     comprising the boundary of the polygon.
3498    </para>
3499
3500    <para>
3501     Polygons are output using the first syntax.
3502    </para>
3503   </sect2>
3504
3505   <sect2 id="datatype-circle">
3506    <title>Circles</title>
3507
3508    <indexterm>
3509     <primary>circle</primary>
3510    </indexterm>
3511
3512    <para>
3513     Circles are represented by a center point and radius.
3514     Values of type <type>circle</type> are specified using any of the
3515     following syntaxes:
3516
3517<synopsis>
3518&lt; ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> &gt;
3519( ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> )
3520  ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable>
3521    <replaceable>x</replaceable> , <replaceable>y</replaceable>   , <replaceable>r</replaceable>
3522</synopsis>
3523
3524     where
3525     <literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal>
3526     is the center point and <replaceable>r</replaceable> is the radius of the
3527     circle.
3528    </para>
3529
3530    <para>
3531     Circles are output using the first syntax.
3532    </para>
3533   </sect2>
3534
3535  </sect1>
3536
3537  <sect1 id="datatype-net-types">
3538   <title>Network Address Types</title>
3539
3540   <indexterm zone="datatype-net-types">
3541    <primary>network</primary>
3542    <secondary>data types</secondary>
3543   </indexterm>
3544
3545   <para>
3546    <productname>PostgreSQL</productname> offers data types to store IPv4, IPv6, and MAC
3547    addresses, as shown in <xref linkend="datatype-net-types-table"/>.  It
3548    is better to use these types instead of plain text types to store
3549    network addresses, because
3550    these types offer input error checking and specialized
3551    operators and functions (see <xref linkend="functions-net"/>).
3552   </para>
3553
3554    <table tocentry="1" id="datatype-net-types-table">
3555     <title>Network Address Types</title>
3556     <tgroup cols="3">
3557      <colspec colname="col1" colwidth="1*"/>
3558      <colspec colname="col2" colwidth="1*"/>
3559      <colspec colname="col3" colwidth="2*"/>
3560      <thead>
3561       <row>
3562        <entry>Name</entry>
3563        <entry>Storage Size</entry>
3564        <entry>Description</entry>
3565       </row>
3566      </thead>
3567      <tbody>
3568
3569       <row>
3570        <entry><type>cidr</type></entry>
3571        <entry>7 or 19 bytes</entry>
3572        <entry>IPv4 and IPv6 networks</entry>
3573       </row>
3574
3575       <row>
3576        <entry><type>inet</type></entry>
3577        <entry>7 or 19 bytes</entry>
3578        <entry>IPv4 and IPv6 hosts and networks</entry>
3579       </row>
3580
3581       <row>
3582        <entry><type>macaddr</type></entry>
3583        <entry>6 bytes</entry>
3584        <entry>MAC addresses</entry>
3585       </row>
3586
3587       <row>
3588        <entry><type>macaddr8</type></entry>
3589        <entry>8 bytes</entry>
3590        <entry>MAC addresses (EUI-64 format)</entry>
3591       </row>
3592
3593      </tbody>
3594     </tgroup>
3595    </table>
3596
3597   <para>
3598    When sorting <type>inet</type> or <type>cidr</type> data types,
3599    IPv4 addresses will always sort before IPv6 addresses, including
3600    IPv4 addresses encapsulated or mapped to IPv6 addresses, such as
3601    ::10.2.3.4 or ::ffff:10.4.3.2.
3602   </para>
3603
3604
3605   <sect2 id="datatype-inet">
3606    <title><type>inet</type></title>
3607
3608    <indexterm>
3609     <primary>inet (data type)</primary>
3610    </indexterm>
3611
3612    <para>
3613     The <type>inet</type> type holds an IPv4 or IPv6 host address, and
3614     optionally its subnet, all in one field.
3615     The subnet is represented by the number of network address bits
3616     present in the host address (the
3617     <quote>netmask</quote>).  If the netmask is 32 and the address is IPv4,
3618     then the value does not indicate a subnet, only a single host.
3619     In IPv6, the address length is 128 bits, so 128 bits specify a
3620     unique host address.  Note that if you
3621     want to accept only networks, you should use the
3622     <type>cidr</type> type rather than <type>inet</type>.
3623    </para>
3624
3625    <para>
3626      The input format for this type is
3627      <replaceable class="parameter">address/y</replaceable>
3628      where
3629      <replaceable class="parameter">address</replaceable>
3630      is an IPv4 or IPv6 address and
3631      <replaceable class="parameter">y</replaceable>
3632      is the number of bits in the netmask.  If the
3633      <replaceable class="parameter">/y</replaceable>
3634      portion is omitted, the
3635      netmask is taken to be 32 for IPv4 or 128 for IPv6,
3636      so the value represents
3637      just a single host.  On display, the
3638      <replaceable class="parameter">/y</replaceable>
3639      portion is suppressed if the netmask specifies a single host.
3640    </para>
3641   </sect2>
3642
3643   <sect2 id="datatype-cidr">
3644    <title><type>cidr</type></title>
3645
3646    <indexterm>
3647     <primary>cidr</primary>
3648    </indexterm>
3649
3650    <para>
3651     The <type>cidr</type> type holds an IPv4 or IPv6 network specification.
3652     Input and output formats follow Classless Internet Domain Routing
3653     conventions.
3654     The format for specifying networks is <replaceable
3655     class="parameter">address/y</replaceable> where <replaceable
3656     class="parameter">address</replaceable> is the network's lowest
3657     address represented as an
3658     IPv4 or IPv6 address, and <replaceable
3659     class="parameter">y</replaceable> is the number of bits in the netmask.  If
3660     <replaceable class="parameter">y</replaceable> is omitted, it is calculated
3661     using assumptions from the older classful network numbering system, except
3662     it will be at least large enough to include all of the octets
3663     written in the input.  It is an error to specify a network address
3664     that has bits set to the right of the specified netmask.
3665    </para>
3666
3667    <para>
3668     <xref linkend="datatype-net-cidr-table"/> shows some examples.
3669    </para>
3670
3671     <table id="datatype-net-cidr-table">
3672      <title><type>cidr</type> Type Input Examples</title>
3673      <tgroup cols="3">
3674       <thead>
3675        <row>
3676         <entry><type>cidr</type> Input</entry>
3677         <entry><type>cidr</type> Output</entry>
3678         <entry><literal><function>abbrev(<type>cidr</type>)</function></literal></entry>
3679        </row>
3680       </thead>
3681       <tbody>
3682        <row>
3683         <entry>192.168.100.128/25</entry>
3684         <entry>192.168.100.128/25</entry>
3685         <entry>192.168.100.128/25</entry>
3686        </row>
3687        <row>
3688         <entry>192.168/24</entry>
3689         <entry>192.168.0.0/24</entry>
3690         <entry>192.168.0/24</entry>
3691        </row>
3692        <row>
3693         <entry>192.168/25</entry>
3694         <entry>192.168.0.0/25</entry>
3695         <entry>192.168.0.0/25</entry>
3696        </row>
3697        <row>
3698         <entry>192.168.1</entry>
3699         <entry>192.168.1.0/24</entry>
3700         <entry>192.168.1/24</entry>
3701        </row>
3702        <row>
3703         <entry>192.168</entry>
3704         <entry>192.168.0.0/24</entry>
3705         <entry>192.168.0/24</entry>
3706        </row>
3707        <row>
3708         <entry>128.1</entry>
3709         <entry>128.1.0.0/16</entry>
3710         <entry>128.1/16</entry>
3711        </row>
3712        <row>
3713         <entry>128</entry>
3714         <entry>128.0.0.0/16</entry>
3715         <entry>128.0/16</entry>
3716        </row>
3717        <row>
3718         <entry>128.1.2</entry>
3719         <entry>128.1.2.0/24</entry>
3720         <entry>128.1.2/24</entry>
3721        </row>
3722        <row>
3723         <entry>10.1.2</entry>
3724         <entry>10.1.2.0/24</entry>
3725         <entry>10.1.2/24</entry>
3726        </row>
3727        <row>
3728         <entry>10.1</entry>
3729         <entry>10.1.0.0/16</entry>
3730         <entry>10.1/16</entry>
3731        </row>
3732        <row>
3733         <entry>10</entry>
3734         <entry>10.0.0.0/8</entry>
3735         <entry>10/8</entry>
3736        </row>
3737        <row>
3738         <entry>10.1.2.3/32</entry>
3739         <entry>10.1.2.3/32</entry>
3740         <entry>10.1.2.3/32</entry>
3741        </row>
3742        <row>
3743         <entry>2001:4f8:3:ba::/64</entry>
3744         <entry>2001:4f8:3:ba::/64</entry>
3745         <entry>2001:4f8:3:ba/64</entry>
3746        </row>
3747        <row>
3748         <entry>2001:4f8:3:ba:&zwsp;2e0:81ff:fe22:d1f1/128</entry>
3749         <entry>2001:4f8:3:ba:&zwsp;2e0:81ff:fe22:d1f1/128</entry>
3750         <entry>2001:4f8:3:ba:&zwsp;2e0:81ff:fe22:d1f1/128</entry>
3751        </row>
3752        <row>
3753         <entry>::ffff:1.2.3.0/120</entry>
3754         <entry>::ffff:1.2.3.0/120</entry>
3755         <entry>::ffff:1.2.3/120</entry>
3756        </row>
3757        <row>
3758         <entry>::ffff:1.2.3.0/128</entry>
3759         <entry>::ffff:1.2.3.0/128</entry>
3760         <entry>::ffff:1.2.3.0/128</entry>
3761        </row>
3762       </tbody>
3763      </tgroup>
3764     </table>
3765   </sect2>
3766
3767   <sect2 id="datatype-inet-vs-cidr">
3768    <title><type>inet</type> vs. <type>cidr</type></title>
3769
3770    <para>
3771    The essential difference between <type>inet</type> and <type>cidr</type>
3772    data types is that <type>inet</type> accepts values with nonzero bits to
3773    the right of the netmask, whereas <type>cidr</type> does not.  For
3774    example, <literal>192.168.0.1/24</literal> is valid for <type>inet</type>
3775    but not for <type>cidr</type>.
3776    </para>
3777
3778      <tip>
3779        <para>
3780        If you do not like the output format for <type>inet</type> or
3781        <type>cidr</type> values, try the functions <function>host</function>,
3782        <function>text</function>, and <function>abbrev</function>.
3783        </para>
3784      </tip>
3785   </sect2>
3786
3787   <sect2 id="datatype-macaddr">
3788    <title><type>macaddr</type></title>
3789
3790    <indexterm>
3791     <primary>macaddr (data type)</primary>
3792    </indexterm>
3793
3794    <indexterm>
3795     <primary>MAC address</primary>
3796     <see>macaddr</see>
3797    </indexterm>
3798
3799    <para>
3800     The <type>macaddr</type> type stores MAC addresses, known for example
3801     from Ethernet card hardware addresses (although MAC addresses are
3802     used for other purposes as well).  Input is accepted in the
3803     following formats:
3804
3805     <simplelist>
3806      <member><literal>'08:00:2b:01:02:03'</literal></member>
3807      <member><literal>'08-00-2b-01-02-03'</literal></member>
3808      <member><literal>'08002b:010203'</literal></member>
3809      <member><literal>'08002b-010203'</literal></member>
3810      <member><literal>'0800.2b01.0203'</literal></member>
3811      <member><literal>'0800-2b01-0203'</literal></member>
3812      <member><literal>'08002b010203'</literal></member>
3813     </simplelist>
3814
3815     These examples all specify the same address.  Upper and
3816     lower case is accepted for the digits
3817     <literal>a</literal> through <literal>f</literal>.  Output is always in the
3818     first of the forms shown.
3819    </para>
3820
3821    <para>
3822     IEEE Std 802-2001 specifies the second shown form (with hyphens)
3823     as the canonical form for MAC addresses, and specifies the first
3824     form (with colons) as the bit-reversed notation, so that
3825     08-00-2b-01-02-03 = 01:00:4D:08:04:0C.  This convention is widely
3826     ignored nowadays, and it is relevant only for obsolete network
3827     protocols (such as Token Ring).  PostgreSQL makes no provisions
3828     for bit reversal, and all accepted formats use the canonical LSB
3829     order.
3830    </para>
3831
3832    <para>
3833     The remaining five input formats are not part of any standard.
3834    </para>
3835   </sect2>
3836
3837   <sect2 id="datatype-macaddr8">
3838    <title><type>macaddr8</type></title>
3839
3840    <indexterm>
3841     <primary>macaddr8 (data type)</primary>
3842    </indexterm>
3843
3844    <indexterm>
3845     <primary>MAC address (EUI-64 format)</primary>
3846     <see>macaddr</see>
3847    </indexterm>
3848
3849    <para>
3850     The <type>macaddr8</type> type stores MAC addresses in EUI-64
3851     format, known for example from Ethernet card hardware addresses
3852     (although MAC addresses are used for other purposes as well).
3853     This type can accept both 6 and 8 byte length MAC addresses
3854     and stores them in 8 byte length format.  MAC addresses given
3855     in 6 byte format will be stored in 8 byte length format with the
3856     4th and 5th bytes set to FF and FE, respectively.
3857
3858     Note that IPv6 uses a modified EUI-64 format where the 7th bit
3859     should be set to one after the conversion from EUI-48.  The
3860     function <function>macaddr8_set7bit</function> is provided to make this
3861     change.
3862
3863     Generally speaking, any input which is comprised of pairs of hex
3864     digits (on byte boundaries), optionally separated consistently by
3865     one of <literal>':'</literal>, <literal>'-'</literal> or <literal>'.'</literal>, is
3866     accepted.  The number of hex digits must be either 16 (8 bytes) or
3867     12 (6 bytes).  Leading and trailing whitespace is ignored.
3868
3869     The following are examples of input formats that are accepted:
3870
3871     <simplelist>
3872      <member><literal>'08:00:2b:01:02:03:04:05'</literal></member>
3873      <member><literal>'08-00-2b-01-02-03-04-05'</literal></member>
3874      <member><literal>'08002b:0102030405'</literal></member>
3875      <member><literal>'08002b-0102030405'</literal></member>
3876      <member><literal>'0800.2b01.0203.0405'</literal></member>
3877      <member><literal>'0800-2b01-0203-0405'</literal></member>
3878      <member><literal>'08002b01:02030405'</literal></member>
3879      <member><literal>'08002b0102030405'</literal></member>
3880     </simplelist>
3881
3882     These examples all specify the same address.  Upper and
3883     lower case is accepted for the digits
3884     <literal>a</literal> through <literal>f</literal>.  Output is always in the
3885     first of the forms shown.
3886    </para>
3887
3888    <para>
3889     The last six input formats shown above are not part of any standard.
3890    </para>
3891
3892    <para>
3893     To convert a traditional 48 bit MAC address in EUI-48 format to
3894     modified EUI-64 format to be included as the host portion of an
3895     IPv6 address, use <function>macaddr8_set7bit</function> as shown:
3896
3897<programlisting>
3898SELECT macaddr8_set7bit('08:00:2b:01:02:03');
3899<computeroutput>
3900    macaddr8_set7bit
3901-------------------------
3902 0a:00:2b:ff:fe:01:02:03
3903(1 row)
3904</computeroutput>
3905</programlisting>
3906
3907    </para>
3908
3909   </sect2>
3910
3911  </sect1>
3912
3913  <sect1 id="datatype-bit">
3914   <title>Bit String Types</title>
3915
3916   <indexterm zone="datatype-bit">
3917    <primary>bit string</primary>
3918    <secondary>data type</secondary>
3919   </indexterm>
3920
3921   <para>
3922    Bit strings are strings of 1's and 0's.  They can be used to store
3923    or visualize bit masks.  There are two SQL bit types:
3924    <type>bit(<replaceable>n</replaceable>)</type> and <type>bit
3925    varying(<replaceable>n</replaceable>)</type>, where
3926    <replaceable>n</replaceable> is a positive integer.
3927   </para>
3928
3929   <para>
3930    <type>bit</type> type data must match the length
3931    <replaceable>n</replaceable> exactly; it is an error to attempt to
3932    store shorter or longer bit strings.  <type>bit varying</type> data is
3933    of variable length up to the maximum length
3934    <replaceable>n</replaceable>; longer strings will be rejected.
3935    Writing <type>bit</type> without a length is equivalent to
3936    <literal>bit(1)</literal>, while <type>bit varying</type> without a length
3937    specification means unlimited length.
3938   </para>
3939
3940   <note>
3941    <para>
3942     If one explicitly casts a bit-string value to
3943     <type>bit(<replaceable>n</replaceable>)</type>, it will be truncated or
3944     zero-padded on the right to be exactly <replaceable>n</replaceable> bits,
3945     without raising an error.  Similarly,
3946     if one explicitly casts a bit-string value to
3947     <type>bit varying(<replaceable>n</replaceable>)</type>, it will be truncated
3948     on the right if it is more than <replaceable>n</replaceable> bits.
3949    </para>
3950   </note>
3951
3952   <para>
3953    Refer to <xref
3954    linkend="sql-syntax-bit-strings"/> for information about the syntax
3955    of bit string constants.  Bit-logical operators and string
3956    manipulation functions are available; see <xref
3957    linkend="functions-bitstring"/>.
3958   </para>
3959
3960   <example>
3961    <title>Using the Bit String Types</title>
3962
3963<programlisting>
3964CREATE TABLE test (a BIT(3), b BIT VARYING(5));
3965INSERT INTO test VALUES (B'101', B'00');
3966INSERT INTO test VALUES (B'10', B'101');
3967<computeroutput>
3968ERROR:  bit string length 2 does not match type bit(3)
3969</computeroutput>
3970INSERT INTO test VALUES (B'10'::bit(3), B'101');
3971SELECT * FROM test;
3972<computeroutput>
3973  a  |  b
3974-----+-----
3975 101 | 00
3976 100 | 101
3977</computeroutput>
3978</programlisting>
3979   </example>
3980
3981   <para>
3982    A bit string value requires 1 byte for each group of 8 bits, plus
3983    5 or 8 bytes overhead depending on the length of the string
3984    (but long values may be compressed or moved out-of-line, as explained
3985    in <xref linkend="datatype-character"/> for character strings).
3986   </para>
3987  </sect1>
3988
3989  <sect1 id="datatype-textsearch">
3990   <title>Text Search Types</title>
3991
3992   <indexterm zone="datatype-textsearch">
3993    <primary>full text search</primary>
3994    <secondary>data types</secondary>
3995   </indexterm>
3996
3997   <indexterm zone="datatype-textsearch">
3998    <primary>text search</primary>
3999    <secondary>data types</secondary>
4000   </indexterm>
4001
4002   <para>
4003    <productname>PostgreSQL</productname> provides two data types that
4004    are designed to support full text search, which is the activity of
4005    searching through a collection of natural-language <firstterm>documents</firstterm>
4006    to locate those that best match a <firstterm>query</firstterm>.
4007    The <type>tsvector</type> type represents a document in a form optimized
4008    for text search; the <type>tsquery</type> type similarly represents
4009    a text query.
4010    <xref linkend="textsearch"/> provides a detailed explanation of this
4011    facility, and <xref linkend="functions-textsearch"/> summarizes the
4012    related functions and operators.
4013   </para>
4014
4015   <sect2 id="datatype-tsvector">
4016    <title><type>tsvector</type></title>
4017
4018    <indexterm>
4019     <primary>tsvector (data type)</primary>
4020    </indexterm>
4021
4022    <para>
4023     A <type>tsvector</type> value is a sorted list of distinct
4024     <firstterm>lexemes</firstterm>, which are words that have been
4025     <firstterm>normalized</firstterm> to merge different variants of the same word
4026     (see <xref linkend="textsearch"/> for details).  Sorting and
4027     duplicate-elimination are done automatically during input, as shown in
4028     this example:
4029
4030<programlisting>
4031SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;
4032                      tsvector
4033----------------------------------------------------
4034 'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat'
4035</programlisting>
4036
4037     To represent
4038     lexemes containing whitespace or punctuation, surround them with quotes:
4039
4040<programlisting>
4041SELECT $$the lexeme '    ' contains spaces$$::tsvector;
4042                 tsvector
4043-------------------------------------------
4044 '    ' 'contains' 'lexeme' 'spaces' 'the'
4045</programlisting>
4046
4047     (We use dollar-quoted string literals in this example and the next one
4048     to avoid the confusion of having to double quote marks within the
4049     literals.)  Embedded quotes and backslashes must be doubled:
4050
4051<programlisting>
4052SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
4053                    tsvector
4054------------------------------------------------
4055 'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'
4056</programlisting>
4057
4058     Optionally, integer <firstterm>positions</firstterm>
4059     can be attached to lexemes:
4060
4061<programlisting>
4062SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
4063                                  tsvector
4064-------------------------------------------------------------------&zwsp;------------
4065 'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4
4066</programlisting>
4067
4068     A position normally indicates the source word's location in the
4069     document.  Positional information can be used for
4070     <firstterm>proximity ranking</firstterm>.  Position values can
4071     range from 1 to 16383; larger numbers are silently set to 16383.
4072     Duplicate positions for the same lexeme are discarded.
4073    </para>
4074
4075    <para>
4076     Lexemes that have positions can further be labeled with a
4077     <firstterm>weight</firstterm>, which can be <literal>A</literal>,
4078     <literal>B</literal>, <literal>C</literal>, or <literal>D</literal>.
4079     <literal>D</literal> is the default and hence is not shown on output:
4080
4081<programlisting>
4082SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
4083          tsvector
4084----------------------------
4085 'a':1A 'cat':5 'fat':2B,4C
4086</programlisting>
4087
4088     Weights are typically used to reflect document structure, for example
4089     by marking title words differently from body words.  Text search
4090     ranking functions can assign different priorities to the different
4091     weight markers.
4092    </para>
4093
4094    <para>
4095     It is important to understand that the
4096     <type>tsvector</type> type itself does not perform any word
4097     normalization; it assumes the words it is given are normalized
4098     appropriately for the application.  For example,
4099
4100<programlisting>
4101SELECT 'The Fat Rats'::tsvector;
4102      tsvector
4103--------------------
4104 'Fat' 'Rats' 'The'
4105</programlisting>
4106
4107     For most English-text-searching applications the above words would
4108     be considered non-normalized, but <type>tsvector</type> doesn't care.
4109     Raw document text should usually be passed through
4110     <function>to_tsvector</function> to normalize the words appropriately
4111     for searching:
4112
4113<programlisting>
4114SELECT to_tsvector('english', 'The Fat Rats');
4115   to_tsvector
4116-----------------
4117 'fat':2 'rat':3
4118</programlisting>
4119
4120     Again, see <xref linkend="textsearch"/> for more detail.
4121    </para>
4122
4123   </sect2>
4124
4125   <sect2 id="datatype-tsquery">
4126    <title><type>tsquery</type></title>
4127
4128    <indexterm>
4129     <primary>tsquery (data type)</primary>
4130    </indexterm>
4131
4132    <para>
4133     A <type>tsquery</type> value stores lexemes that are to be
4134     searched for, and can combine them using the Boolean operators
4135     <literal>&amp;</literal> (AND), <literal>|</literal> (OR), and
4136     <literal>!</literal> (NOT), as well as the phrase search operator
4137     <literal>&lt;-&gt;</literal> (FOLLOWED BY).  There is also a variant
4138     <literal>&lt;<replaceable>N</replaceable>&gt;</literal> of the FOLLOWED BY
4139     operator, where <replaceable>N</replaceable> is an integer constant that
4140     specifies the distance between the two lexemes being searched
4141     for.  <literal>&lt;-&gt;</literal> is equivalent to <literal>&lt;1&gt;</literal>.
4142    </para>
4143
4144    <para>
4145     Parentheses can be used to enforce grouping of these operators.
4146     In the absence of parentheses, <literal>!</literal> (NOT) binds most tightly,
4147     <literal>&lt;-&gt;</literal> (FOLLOWED BY) next most tightly, then
4148     <literal>&amp;</literal> (AND), with <literal>|</literal> (OR) binding
4149     the least tightly.
4150    </para>
4151
4152    <para>
4153     Here are some examples:
4154
4155<programlisting>
4156SELECT 'fat &amp; rat'::tsquery;
4157    tsquery
4158---------------
4159 'fat' &amp; 'rat'
4160
4161SELECT 'fat &amp; (rat | cat)'::tsquery;
4162          tsquery
4163---------------------------
4164 'fat' &amp; ( 'rat' | 'cat' )
4165
4166SELECT 'fat &amp; rat &amp; ! cat'::tsquery;
4167        tsquery
4168------------------------
4169 'fat' &amp; 'rat' &amp; !'cat'
4170</programlisting>
4171    </para>
4172
4173    <para>
4174     Optionally, lexemes in a <type>tsquery</type> can be labeled with
4175     one or more weight letters, which restricts them to match only
4176     <type>tsvector</type> lexemes with one of those weights:
4177
4178<programlisting>
4179SELECT 'fat:ab &amp; cat'::tsquery;
4180    tsquery
4181------------------
4182 'fat':AB &amp; 'cat'
4183</programlisting>
4184    </para>
4185
4186    <para>
4187     Also, lexemes in a <type>tsquery</type> can be labeled with <literal>*</literal>
4188     to specify prefix matching:
4189<programlisting>
4190SELECT 'super:*'::tsquery;
4191  tsquery
4192-----------
4193 'super':*
4194</programlisting>
4195     This query will match any word in a <type>tsvector</type> that begins
4196     with <quote>super</quote>.
4197    </para>
4198
4199    <para>
4200     Quoting rules for lexemes are the same as described previously for
4201     lexemes in <type>tsvector</type>; and, as with <type>tsvector</type>,
4202     any required normalization of words must be done before converting
4203     to the <type>tsquery</type> type.  The <function>to_tsquery</function>
4204     function is convenient for performing such normalization:
4205
4206<programlisting>
4207SELECT to_tsquery('Fat:ab &amp; Cats');
4208    to_tsquery
4209------------------
4210 'fat':AB &amp; 'cat'
4211</programlisting>
4212
4213     Note that <function>to_tsquery</function> will process prefixes in the same way
4214     as other words, which means this comparison returns true:
4215
4216<programlisting>
4217SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
4218 ?column?
4219----------
4220 t
4221</programlisting>
4222     because <literal>postgres</literal> gets stemmed to <literal>postgr</literal>:
4223<programlisting>
4224SELECT to_tsvector( 'postgraduate' ), to_tsquery( 'postgres:*' );
4225  to_tsvector  | to_tsquery
4226---------------+------------
4227 'postgradu':1 | 'postgr':*
4228</programlisting>
4229     which will match the stemmed form of <literal>postgraduate</literal>.
4230    </para>
4231
4232   </sect2>
4233
4234  </sect1>
4235
4236  <sect1 id="datatype-uuid">
4237   <title><acronym>UUID</acronym> Type</title>
4238
4239   <indexterm zone="datatype-uuid">
4240    <primary>UUID</primary>
4241   </indexterm>
4242
4243   <para>
4244    The data type <type>uuid</type> stores Universally Unique Identifiers
4245    (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.
4246    (Some systems refer to this data type as a globally unique identifier, or
4247    GUID,<indexterm><primary>GUID</primary></indexterm> instead.)  This
4248    identifier is a 128-bit quantity that is generated by an algorithm chosen
4249    to make it very unlikely that the same identifier will be generated by
4250    anyone else in the known universe using the same algorithm.  Therefore,
4251    for distributed systems, these identifiers provide a better uniqueness
4252    guarantee than sequence generators, which
4253    are only unique within a single database.
4254   </para>
4255
4256   <para>
4257    A UUID is written as a sequence of lower-case hexadecimal digits,
4258    in several groups separated by hyphens, specifically a group of 8
4259    digits followed by three groups of 4 digits followed by a group of
4260    12 digits, for a total of 32 digits representing the 128 bits.  An
4261    example of a UUID in this standard form is:
4262<programlisting>
4263a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
4264</programlisting>
4265    <productname>PostgreSQL</productname> also accepts the following
4266    alternative forms for input:
4267    use of upper-case digits, the standard format surrounded by
4268    braces, omitting some or all hyphens, adding a hyphen after any
4269    group of four digits.  Examples are:
4270<programlisting>
4271A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11
4272{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}
4273a0eebc999c0b4ef8bb6d6bb9bd380a11
4274a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
4275{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}
4276</programlisting>
4277    Output is always in the standard form.
4278   </para>
4279
4280   <para>
4281    See <xref linkend="functions-uuid"/> for how to generate a UUID in
4282    <productname>PostgreSQL</productname>.
4283   </para>
4284  </sect1>
4285
4286  <sect1 id="datatype-xml">
4287   <title><acronym>XML</acronym> Type</title>
4288
4289   <indexterm zone="datatype-xml">
4290    <primary>XML</primary>
4291   </indexterm>
4292
4293   <para>
4294    The <type>xml</type> data type can be used to store XML data.  Its
4295    advantage over storing XML data in a <type>text</type> field is that it
4296    checks the input values for well-formedness, and there are support
4297    functions to perform type-safe operations on it; see <xref
4298    linkend="functions-xml"/>.  Use of this data type requires the
4299    installation to have been built with <command>configure
4300    --with-libxml</command>.
4301   </para>
4302
4303   <para>
4304    The <type>xml</type> type can store well-formed
4305    <quote>documents</quote>, as defined by the XML standard, as well
4306    as <quote>content</quote> fragments, which are defined by reference
4307    to the more permissive
4308    <ulink url="https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/#DocumentNode"><quote>document node</quote></ulink>
4309    of the XQuery and XPath data model.
4310    Roughly, this means that content fragments can have
4311    more than one top-level element or character node.  The expression
4312    <literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal>
4313    can be used to evaluate whether a particular <type>xml</type>
4314    value is a full document or only a content fragment.
4315   </para>
4316
4317   <para>
4318    Limits and compatibility notes for the <type>xml</type> data type
4319    can be found in <xref linkend="xml-limits-conformance"/>.
4320   </para>
4321
4322   <sect2>
4323    <title>Creating XML Values</title>
4324   <para>
4325    To produce a value of type <type>xml</type> from character data,
4326    use the function
4327    <function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm>
4328<synopsis>
4329XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
4330</synopsis>
4331    Examples:
4332<programlisting><![CDATA[
4333XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
4334XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
4335]]></programlisting>
4336    While this is the only way to convert character strings into XML
4337    values according to the SQL standard, the PostgreSQL-specific
4338    syntaxes:
4339<programlisting><![CDATA[
4340xml '<foo>bar</foo>'
4341'<foo>bar</foo>'::xml
4342]]></programlisting>
4343    can also be used.
4344   </para>
4345
4346   <para>
4347    The <type>xml</type> type does not validate input values
4348    against a document type declaration
4349    (DTD),<indexterm><primary>DTD</primary></indexterm>
4350    even when the input value specifies a DTD.
4351    There is also currently no built-in support for validating against
4352    other XML schema languages such as XML Schema.
4353   </para>
4354
4355   <para>
4356    The inverse operation, producing a character string value from
4357    <type>xml</type>, uses the function
4358    <function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
4359<synopsis>
4360XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
4361</synopsis>
4362    <replaceable>type</replaceable> can be
4363    <type>character</type>, <type>character varying</type>, or
4364    <type>text</type> (or an alias for one of those).  Again, according
4365    to the SQL standard, this is the only way to convert between type
4366    <type>xml</type> and character types, but PostgreSQL also allows
4367    you to simply cast the value.
4368   </para>
4369
4370   <para>
4371    When a character string value is cast to or from type
4372    <type>xml</type> without going through <type>XMLPARSE</type> or
4373    <type>XMLSERIALIZE</type>, respectively, the choice of
4374    <literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is
4375    determined by the <quote>XML option</quote>
4376    <indexterm><primary>XML option</primary></indexterm>
4377    session configuration parameter, which can be set using the
4378    standard command:
4379<synopsis>
4380SET XML OPTION { DOCUMENT | CONTENT };
4381</synopsis>
4382    or the more PostgreSQL-like syntax
4383<synopsis>
4384SET xmloption TO { DOCUMENT | CONTENT };
4385</synopsis>
4386    The default is <literal>CONTENT</literal>, so all forms of XML
4387    data are allowed.
4388   </para>
4389
4390   </sect2>
4391
4392   <sect2>
4393    <title>Encoding Handling</title>
4394   <para>
4395    Care must be taken when dealing with multiple character encodings
4396    on the client, server, and in the XML data passed through them.
4397    When using the text mode to pass queries to the server and query
4398    results to the client (which is the normal mode), PostgreSQL
4399    converts all character data passed between the client and the
4400    server and vice versa to the character encoding of the respective
4401    end; see <xref linkend="multibyte"/>.  This includes string
4402    representations of XML values, such as in the above examples.
4403    This would ordinarily mean that encoding declarations contained in
4404    XML data can become invalid as the character data is converted
4405    to other encodings while traveling between client and server,
4406    because the embedded encoding declaration is not changed.  To cope
4407    with this behavior, encoding declarations contained in
4408    character strings presented for input to the <type>xml</type> type
4409    are <emphasis>ignored</emphasis>, and content is assumed
4410    to be in the current server encoding.  Consequently, for correct
4411    processing, character strings of XML data must be sent
4412    from the client in the current client encoding.  It is the
4413    responsibility of the client to either convert documents to the
4414    current client encoding before sending them to the server, or to
4415    adjust the client encoding appropriately.  On output, values of
4416    type <type>xml</type> will not have an encoding declaration, and
4417    clients should assume all data is in the current client
4418    encoding.
4419   </para>
4420
4421   <para>
4422    When using binary mode to pass query parameters to the server
4423    and query results back to the client, no encoding conversion
4424    is performed, so the situation is different.  In this case, an
4425    encoding declaration in the XML data will be observed, and if it
4426    is absent, the data will be assumed to be in UTF-8 (as required by
4427    the XML standard; note that PostgreSQL does not support UTF-16).
4428    On output, data will have an encoding declaration
4429    specifying the client encoding, unless the client encoding is
4430    UTF-8, in which case it will be omitted.
4431   </para>
4432
4433   <para>
4434    Needless to say, processing XML data with PostgreSQL will be less
4435    error-prone and more efficient if the XML data encoding, client encoding,
4436    and server encoding are the same.  Since XML data is internally
4437    processed in UTF-8, computations will be most efficient if the
4438    server encoding is also UTF-8.
4439   </para>
4440
4441   <caution>
4442    <para>
4443     Some XML-related functions may not work at all on non-ASCII data
4444     when the server encoding is not UTF-8.  This is known to be an
4445     issue for <function>xmltable()</function> and <function>xpath()</function> in particular.
4446    </para>
4447   </caution>
4448   </sect2>
4449
4450   <sect2>
4451   <title>Accessing XML Values</title>
4452
4453   <para>
4454    The <type>xml</type> data type is unusual in that it does not
4455    provide any comparison operators.  This is because there is no
4456    well-defined and universally useful comparison algorithm for XML
4457    data.  One consequence of this is that you cannot retrieve rows by
4458    comparing an <type>xml</type> column against a search value.  XML
4459    values should therefore typically be accompanied by a separate key
4460    field such as an ID.  An alternative solution for comparing XML
4461    values is to convert them to character strings first, but note
4462    that character string comparison has little to do with a useful
4463    XML comparison method.
4464   </para>
4465
4466   <para>
4467    Since there are no comparison operators for the <type>xml</type>
4468    data type, it is not possible to create an index directly on a
4469    column of this type.  If speedy searches in XML data are desired,
4470    possible workarounds include casting the expression to a
4471    character string type and indexing that, or indexing an XPath
4472    expression.  Of course, the actual query would have to be adjusted
4473    to search by the indexed expression.
4474   </para>
4475
4476   <para>
4477    The text-search functionality in PostgreSQL can also be used to speed
4478    up full-document searches of XML data.  The necessary
4479    preprocessing support is, however, not yet available in the PostgreSQL
4480    distribution.
4481   </para>
4482   </sect2>
4483  </sect1>
4484
4485  &json;
4486
4487  &array;
4488
4489  &rowtypes;
4490
4491  &rangetypes;
4492
4493  <sect1 id="domains">
4494   <title>Domain Types</title>
4495
4496   <indexterm zone="domains">
4497    <primary>domain</primary>
4498   </indexterm>
4499
4500   <indexterm zone="domains">
4501    <primary>data type</primary>
4502    <secondary>domain</secondary>
4503   </indexterm>
4504
4505   <para>
4506    A <firstterm>domain</firstterm> is a user-defined data type that is
4507    based on another <firstterm>underlying type</firstterm>.  Optionally,
4508    it can have constraints that restrict its valid values to a subset of
4509    what the underlying type would allow.  Otherwise it behaves like the
4510    underlying type &mdash; for example, any operator or function that
4511    can be applied to the underlying type will work on the domain type.
4512    The underlying type can be any built-in or user-defined base type,
4513    enum type, array type, composite type, range type, or another domain.
4514   </para>
4515
4516   <para>
4517    For example, we could create a domain over integers that accepts only
4518    positive integers:
4519<programlisting>
4520CREATE DOMAIN posint AS integer CHECK (VALUE &gt; 0);
4521CREATE TABLE mytable (id posint);
4522INSERT INTO mytable VALUES(1);   -- works
4523INSERT INTO mytable VALUES(-1);  -- fails
4524</programlisting>
4525   </para>
4526
4527   <para>
4528    When an operator or function of the underlying type is applied to a
4529    domain value, the domain is automatically down-cast to the underlying
4530    type.  Thus, for example, the result of <literal>mytable.id - 1</literal>
4531    is considered to be of type <type>integer</type> not <type>posint</type>.
4532    We could write <literal>(mytable.id - 1)::posint</literal> to cast the
4533    result back to <type>posint</type>, causing the domain's constraints
4534    to be rechecked.  In this case, that would result in an error if the
4535    expression had been applied to an <structfield>id</structfield> value of
4536    1.  Assigning a value of the underlying type to a field or variable of
4537    the domain type is allowed without writing an explicit cast, but the
4538    domain's constraints will be checked.
4539   </para>
4540
4541   <para>
4542    For additional information see <xref linkend="sql-createdomain"/>.
4543   </para>
4544  </sect1>
4545
4546  <sect1 id="datatype-oid">
4547   <title>Object Identifier Types</title>
4548
4549   <indexterm zone="datatype-oid">
4550    <primary>object identifier</primary>
4551    <secondary>data type</secondary>
4552   </indexterm>
4553
4554   <indexterm zone="datatype-oid">
4555    <primary>oid</primary>
4556   </indexterm>
4557
4558   <indexterm zone="datatype-oid">
4559    <primary>regclass</primary>
4560   </indexterm>
4561
4562   <indexterm zone="datatype-oid">
4563    <primary>regcollation</primary>
4564   </indexterm>
4565
4566   <indexterm zone="datatype-oid">
4567    <primary>regconfig</primary>
4568   </indexterm>
4569
4570   <indexterm zone="datatype-oid">
4571    <primary>regdictionary</primary>
4572   </indexterm>
4573
4574   <indexterm zone="datatype-oid">
4575    <primary>regnamespace</primary>
4576   </indexterm>
4577
4578   <indexterm zone="datatype-oid">
4579    <primary>regoper</primary>
4580   </indexterm>
4581
4582   <indexterm zone="datatype-oid">
4583    <primary>regoperator</primary>
4584   </indexterm>
4585
4586   <indexterm zone="datatype-oid">
4587    <primary>regproc</primary>
4588   </indexterm>
4589
4590   <indexterm zone="datatype-oid">
4591    <primary>regprocedure</primary>
4592   </indexterm>
4593
4594   <indexterm zone="datatype-oid">
4595    <primary>regrole</primary>
4596   </indexterm>
4597
4598   <indexterm zone="datatype-oid">
4599    <primary>regtype</primary>
4600   </indexterm>
4601
4602   <indexterm zone="datatype-oid">
4603    <primary>xid8</primary>
4604   </indexterm>
4605
4606   <indexterm zone="datatype-oid">
4607    <primary>cid</primary>
4608   </indexterm>
4609
4610   <indexterm zone="datatype-oid">
4611    <primary>tid</primary>
4612   </indexterm>
4613
4614   <indexterm zone="datatype-oid">
4615    <primary>xid</primary>
4616   </indexterm>
4617
4618   <para>
4619    Object identifiers (OIDs) are used internally by
4620    <productname>PostgreSQL</productname> as primary keys for various
4621    system tables.
4622    Type <type>oid</type> represents an object identifier.  There are also
4623    several alias types for <type>oid</type> named <type>reg<replaceable>something</replaceable></type>.
4624    <xref linkend="datatype-oid-table"/> shows an
4625    overview.
4626   </para>
4627
4628   <para>
4629    The <type>oid</type> type is currently implemented as an unsigned
4630    four-byte integer.  Therefore, it is not large enough to provide
4631    database-wide uniqueness in large databases, or even in large
4632    individual tables.
4633   </para>
4634
4635   <para>
4636    The <type>oid</type> type itself has few operations beyond comparison.
4637    It can be cast to integer, however, and then manipulated using the
4638    standard integer operators.  (Beware of possible
4639    signed-versus-unsigned confusion if you do this.)
4640   </para>
4641
4642   <para>
4643    The OID alias types have no operations of their own except
4644    for specialized input and output routines.  These routines are able
4645    to accept and display symbolic names for system objects, rather than
4646    the raw numeric value that type <type>oid</type> would use.  The alias
4647    types allow simplified lookup of OID values for objects.  For example,
4648    to examine the <structname>pg_attribute</structname> rows related to a table
4649    <literal>mytable</literal>, one could write:
4650<programlisting>
4651SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;
4652</programlisting>
4653    rather than:
4654<programlisting>
4655SELECT * FROM pg_attribute
4656  WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');
4657</programlisting>
4658    While that doesn't look all that bad by itself, it's still oversimplified.
4659    A far more complicated sub-select would be needed to
4660    select the right OID if there are multiple tables named
4661    <literal>mytable</literal> in different schemas.
4662    The <type>regclass</type> input converter handles the table lookup according
4663    to the schema path setting, and so it does the <quote>right thing</quote>
4664    automatically.  Similarly, casting a table's OID to
4665    <type>regclass</type> is handy for symbolic display of a numeric OID.
4666   </para>
4667
4668    <table id="datatype-oid-table">
4669     <title>Object Identifier Types</title>
4670     <tgroup cols="4">
4671      <thead>
4672       <row>
4673        <entry>Name</entry>
4674        <entry>References</entry>
4675        <entry>Description</entry>
4676        <entry>Value Example</entry>
4677       </row>
4678      </thead>
4679
4680      <tbody>
4681
4682       <row>
4683        <entry><type>oid</type></entry>
4684        <entry>any</entry>
4685        <entry>numeric object identifier</entry>
4686        <entry><literal>564182</literal></entry>
4687       </row>
4688
4689       <row>
4690        <entry><type>regclass</type></entry>
4691        <entry><structname>pg_class</structname></entry>
4692        <entry>relation name</entry>
4693        <entry><literal>pg_type</literal></entry>
4694       </row>
4695
4696       <row>
4697        <entry><type>regcollation</type></entry>
4698        <entry><structname>pg_collation</structname></entry>
4699        <entry>collation name</entry>
4700        <entry><literal>"POSIX"</literal></entry>
4701       </row>
4702
4703       <row>
4704        <entry><type>regconfig</type></entry>
4705        <entry><structname>pg_ts_config</structname></entry>
4706        <entry>text search configuration</entry>
4707        <entry><literal>english</literal></entry>
4708       </row>
4709
4710       <row>
4711        <entry><type>regdictionary</type></entry>
4712        <entry><structname>pg_ts_dict</structname></entry>
4713        <entry>text search dictionary</entry>
4714        <entry><literal>simple</literal></entry>
4715       </row>
4716
4717       <row>
4718        <entry><type>regnamespace</type></entry>
4719        <entry><structname>pg_namespace</structname></entry>
4720        <entry>namespace name</entry>
4721        <entry><literal>pg_catalog</literal></entry>
4722       </row>
4723
4724       <row>
4725        <entry><type>regoper</type></entry>
4726        <entry><structname>pg_operator</structname></entry>
4727        <entry>operator name</entry>
4728        <entry><literal>+</literal></entry>
4729       </row>
4730
4731       <row>
4732        <entry><type>regoperator</type></entry>
4733        <entry><structname>pg_operator</structname></entry>
4734        <entry>operator with argument types</entry>
4735        <entry><literal>*(integer,&zwsp;integer)</literal>
4736         or <literal>-(NONE,&zwsp;integer)</literal></entry>
4737       </row>
4738
4739       <row>
4740        <entry><type>regproc</type></entry>
4741        <entry><structname>pg_proc</structname></entry>
4742        <entry>function name</entry>
4743        <entry><literal>sum</literal></entry>
4744       </row>
4745
4746       <row>
4747        <entry><type>regprocedure</type></entry>
4748        <entry><structname>pg_proc</structname></entry>
4749        <entry>function with argument types</entry>
4750        <entry><literal>sum(int4)</literal></entry>
4751       </row>
4752
4753       <row>
4754        <entry><type>regrole</type></entry>
4755        <entry><structname>pg_authid</structname></entry>
4756        <entry>role name</entry>
4757        <entry><literal>smithee</literal></entry>
4758       </row>
4759
4760       <row>
4761        <entry><type>regtype</type></entry>
4762        <entry><structname>pg_type</structname></entry>
4763        <entry>data type name</entry>
4764        <entry><literal>integer</literal></entry>
4765       </row>
4766      </tbody>
4767     </tgroup>
4768    </table>
4769
4770   <para>
4771    All of the OID alias types for objects grouped by namespace accept
4772    schema-qualified names, and will
4773    display schema-qualified names on output if the object would not
4774    be found in the current search path without being qualified.
4775    The <type>regproc</type> and <type>regoper</type> alias types will only
4776    accept input names that are unique (not overloaded), so they are
4777    of limited use; for most uses <type>regprocedure</type> or
4778    <type>regoperator</type> are more appropriate.  For <type>regoperator</type>,
4779    unary operators are identified by writing <literal>NONE</literal> for the unused
4780    operand.
4781   </para>
4782
4783   <para>
4784    An additional property of most of the OID alias types is the creation of
4785    dependencies.  If a
4786    constant of one of these types appears in a stored expression
4787    (such as a column default expression or view), it creates a dependency
4788    on the referenced object.  For example, if a column has a default
4789    expression <literal>nextval('my_seq'::regclass)</literal>,
4790    <productname>PostgreSQL</productname>
4791    understands that the default expression depends on the sequence
4792    <literal>my_seq</literal>; the system will not let the sequence be dropped
4793    without first removing the default expression.
4794    <type>regrole</type> is the only exception for the property. Constants of this
4795    type are not allowed in such expressions.
4796   </para>
4797
4798   <note>
4799   <para>
4800    The OID alias types do not completely follow transaction isolation
4801    rules. The planner also treats them as simple constants, which may
4802    result in sub-optimal planning.
4803   </para>
4804   </note>
4805
4806   <para>
4807    Another identifier type used by the system is <type>xid</type>, or transaction
4808    (abbreviated <abbrev>xact</abbrev>) identifier.  This is the data type of the system columns
4809    <structfield>xmin</structfield> and <structfield>xmax</structfield>.  Transaction identifiers are 32-bit quantities.
4810    In some contexts, a 64-bit variant <type>xid8</type> is used.  Unlike
4811    <type>xid</type> values, <type>xid8</type> values increase strictly
4812    monotonically and cannot be reused in the lifetime of a database cluster.
4813   </para>
4814
4815   <para>
4816    A third identifier type used by the system is <type>cid</type>, or
4817    command identifier.  This is the data type of the system columns
4818    <structfield>cmin</structfield> and <structfield>cmax</structfield>. Command identifiers are also 32-bit quantities.
4819   </para>
4820
4821   <para>
4822    A final identifier type used by the system is <type>tid</type>, or tuple
4823    identifier (row identifier).  This is the data type of the system column
4824    <structfield>ctid</structfield>.  A tuple ID is a pair
4825    (block number, tuple index within block) that identifies the
4826    physical location of the row within its table.
4827   </para>
4828
4829   <para>
4830    (The system columns are further explained in <xref
4831    linkend="ddl-system-columns"/>.)
4832   </para>
4833  </sect1>
4834
4835  <sect1 id="datatype-pg-lsn">
4836   <title><acronym>pg_lsn Type</acronym></title>
4837
4838   <indexterm zone="datatype-pg-lsn">
4839    <primary>pg_lsn</primary>
4840   </indexterm>
4841
4842   <para>
4843    The <type>pg_lsn</type> data type can be used to store LSN (Log Sequence
4844    Number) data which is a pointer to a location in the WAL. This type is a
4845    representation of <type>XLogRecPtr</type> and an internal system type of
4846    <productname>PostgreSQL</productname>.
4847   </para>
4848
4849   <para>
4850    Internally, an LSN is a 64-bit integer, representing a byte position in
4851    the write-ahead log stream.  It is printed as two hexadecimal numbers of
4852    up to 8 digits each, separated by a slash; for example,
4853    <literal>16/B374D848</literal>.  The <type>pg_lsn</type> type supports the
4854    standard comparison operators, like <literal>=</literal> and
4855    <literal>&gt;</literal>.  Two LSNs can be subtracted using the
4856    <literal>-</literal> operator; the result is the number of bytes separating
4857    those write-ahead log locations.
4858   </para>
4859  </sect1>
4860
4861  <sect1 id="datatype-pseudo">
4862   <title>Pseudo-Types</title>
4863
4864   <indexterm zone="datatype-pseudo">
4865    <primary>record</primary>
4866   </indexterm>
4867
4868   <indexterm zone="datatype-pseudo">
4869    <primary>any</primary>
4870   </indexterm>
4871
4872   <indexterm zone="datatype-pseudo">
4873    <primary>anyelement</primary>
4874   </indexterm>
4875
4876   <indexterm zone="datatype-pseudo">
4877    <primary>anyarray</primary>
4878   </indexterm>
4879
4880   <indexterm zone="datatype-pseudo">
4881    <primary>anynonarray</primary>
4882   </indexterm>
4883
4884   <indexterm zone="datatype-pseudo">
4885    <primary>anyenum</primary>
4886   </indexterm>
4887
4888   <indexterm zone="datatype-pseudo">
4889    <primary>anyrange</primary>
4890   </indexterm>
4891
4892   <indexterm zone="datatype-pseudo">
4893    <primary>anycompatible</primary>
4894   </indexterm>
4895
4896   <indexterm zone="datatype-pseudo">
4897    <primary>anycompatiblearray</primary>
4898   </indexterm>
4899
4900   <indexterm zone="datatype-pseudo">
4901    <primary>anycompatiblenonarray</primary>
4902   </indexterm>
4903
4904   <indexterm zone="datatype-pseudo">
4905    <primary>anycompatiblerange</primary>
4906   </indexterm>
4907
4908   <indexterm zone="datatype-pseudo">
4909    <primary>void</primary>
4910   </indexterm>
4911
4912   <indexterm zone="datatype-pseudo">
4913    <primary>trigger</primary>
4914   </indexterm>
4915
4916   <indexterm zone="datatype-pseudo">
4917    <primary>event_trigger</primary>
4918   </indexterm>
4919
4920   <indexterm zone="datatype-pseudo">
4921    <primary>pg_ddl_command</primary>
4922   </indexterm>
4923
4924   <indexterm zone="datatype-pseudo">
4925    <primary>language_handler</primary>
4926   </indexterm>
4927
4928   <indexterm zone="datatype-pseudo">
4929    <primary>fdw_handler</primary>
4930   </indexterm>
4931
4932   <indexterm zone="datatype-pseudo">
4933    <primary>table_am_handler</primary>
4934   </indexterm>
4935
4936   <indexterm zone="datatype-pseudo">
4937    <primary>index_am_handler</primary>
4938   </indexterm>
4939
4940   <indexterm zone="datatype-pseudo">
4941    <primary>tsm_handler</primary>
4942   </indexterm>
4943
4944   <indexterm zone="datatype-pseudo">
4945    <primary>cstring</primary>
4946   </indexterm>
4947
4948   <indexterm zone="datatype-pseudo">
4949    <primary>internal</primary>
4950   </indexterm>
4951
4952   <indexterm zone="datatype-pseudo">
4953    <primary>unknown</primary>
4954   </indexterm>
4955
4956   <para>
4957    The <productname>PostgreSQL</productname> type system contains a
4958    number of special-purpose entries that are collectively called
4959    <firstterm>pseudo-types</firstterm>.  A pseudo-type cannot be used as a
4960    column data type, but it can be used to declare a function's
4961    argument or result type.  Each of the available pseudo-types is
4962    useful in situations where a function's behavior does not
4963    correspond to simply taking or returning a value of a specific
4964    <acronym>SQL</acronym> data type.  <xref
4965    linkend="datatype-pseudotypes-table"/> lists the existing
4966    pseudo-types.
4967   </para>
4968
4969    <table id="datatype-pseudotypes-table">
4970     <title>Pseudo-Types</title>
4971     <tgroup cols="2">
4972      <colspec colname="col1" colwidth="1*"/>
4973      <colspec colname="col2" colwidth="2*"/>
4974      <thead>
4975       <row>
4976        <entry>Name</entry>
4977        <entry>Description</entry>
4978       </row>
4979      </thead>
4980
4981      <tbody>
4982       <row>
4983        <entry><type>any</type></entry>
4984        <entry>Indicates that a function accepts any input data type.</entry>
4985       </row>
4986
4987       <row>
4988        <entry><type>anyelement</type></entry>
4989        <entry>Indicates that a function accepts any data type
4990        (see <xref linkend="extend-types-polymorphic"/>).</entry>
4991       </row>
4992
4993       <row>
4994        <entry><type>anyarray</type></entry>
4995        <entry>Indicates that a function accepts any array data type
4996        (see <xref linkend="extend-types-polymorphic"/>).</entry>
4997       </row>
4998
4999       <row>
5000        <entry><type>anynonarray</type></entry>
5001        <entry>Indicates that a function accepts any non-array data type
5002        (see <xref linkend="extend-types-polymorphic"/>).</entry>
5003       </row>
5004
5005       <row>
5006        <entry><type>anyenum</type></entry>
5007        <entry>Indicates that a function accepts any enum data type
5008        (see <xref linkend="extend-types-polymorphic"/> and
5009        <xref linkend="datatype-enum"/>).</entry>
5010       </row>
5011
5012       <row>
5013        <entry><type>anyrange</type></entry>
5014        <entry>Indicates that a function accepts any range data type
5015        (see <xref linkend="extend-types-polymorphic"/> and
5016        <xref linkend="rangetypes"/>).</entry>
5017       </row>
5018
5019       <row>
5020        <entry><type>anycompatible</type></entry>
5021        <entry>Indicates that a function accepts any data type,
5022        with automatic promotion of multiple arguments to a common data type
5023        (see <xref linkend="extend-types-polymorphic"/>).</entry>
5024       </row>
5025
5026       <row>
5027        <entry><type>anycompatiblearray</type></entry>
5028        <entry>Indicates that a function accepts any array data type,
5029        with automatic promotion of multiple arguments to a common data type
5030        (see <xref linkend="extend-types-polymorphic"/>).</entry>
5031       </row>
5032
5033       <row>
5034        <entry><type>anycompatiblenonarray</type></entry>
5035        <entry>Indicates that a function accepts any non-array data type,
5036        with automatic promotion of multiple arguments to a common data type
5037        (see <xref linkend="extend-types-polymorphic"/>).</entry>
5038       </row>
5039
5040       <row>
5041        <entry><type>anycompatiblerange</type></entry>
5042        <entry>Indicates that a function accepts any range data type,
5043        with automatic promotion of multiple arguments to a common data type
5044        (see <xref linkend="extend-types-polymorphic"/> and
5045        <xref linkend="rangetypes"/>).</entry>
5046       </row>
5047
5048       <row>
5049        <entry><type>cstring</type></entry>
5050        <entry>Indicates that a function accepts or returns a null-terminated C string.</entry>
5051       </row>
5052
5053       <row>
5054        <entry><type>internal</type></entry>
5055        <entry>Indicates that a function accepts or returns a server-internal
5056        data type.</entry>
5057       </row>
5058
5059       <row>
5060        <entry><type>language_handler</type></entry>
5061        <entry>A procedural language call handler is declared to return <type>language_handler</type>.</entry>
5062       </row>
5063
5064       <row>
5065        <entry><type>fdw_handler</type></entry>
5066        <entry>A foreign-data wrapper handler is declared to return <type>fdw_handler</type>.</entry>
5067       </row>
5068
5069       <row>
5070        <entry><type>table_am_handler</type></entry>
5071        <entry>A table access method handler is declared to return <type>table_am_handler</type>.</entry>
5072       </row>
5073
5074       <row>
5075        <entry><type>index_am_handler</type></entry>
5076        <entry>An index access method handler is declared to return <type>index_am_handler</type>.</entry>
5077       </row>
5078
5079       <row>
5080        <entry><type>tsm_handler</type></entry>
5081        <entry>A tablesample method handler is declared to return <type>tsm_handler</type>.</entry>
5082       </row>
5083
5084       <row>
5085        <entry><type>record</type></entry>
5086        <entry>Identifies a function taking or returning an unspecified row type.</entry>
5087       </row>
5088
5089       <row>
5090        <entry><type>trigger</type></entry>
5091        <entry>A trigger function is declared to return <type>trigger.</type></entry>
5092       </row>
5093
5094       <row>
5095        <entry><type>event_trigger</type></entry>
5096        <entry>An event trigger function is declared to return <type>event_trigger.</type></entry>
5097       </row>
5098
5099       <row>
5100        <entry><type>pg_ddl_command</type></entry>
5101        <entry>Identifies a representation of DDL commands that is available to event triggers.</entry>
5102       </row>
5103
5104       <row>
5105        <entry><type>void</type></entry>
5106        <entry>Indicates that a function returns no value.</entry>
5107       </row>
5108
5109       <row>
5110        <entry><type>unknown</type></entry>
5111        <entry>Identifies a not-yet-resolved type, e.g., of an undecorated
5112         string literal.</entry>
5113       </row>
5114      </tbody>
5115     </tgroup>
5116    </table>
5117
5118   <para>
5119    Functions coded in C (whether built-in or dynamically loaded) can be
5120    declared to accept or return any of these pseudo-types.  It is up to
5121    the function author to ensure that the function will behave safely
5122    when a pseudo-type is used as an argument type.
5123   </para>
5124
5125   <para>
5126    Functions coded in procedural languages can use pseudo-types only as
5127    allowed by their implementation languages.  At present most procedural
5128    languages forbid use of a pseudo-type as an argument type, and allow
5129    only <type>void</type> and <type>record</type> as a result type (plus
5130    <type>trigger</type> or <type>event_trigger</type> when the function is used
5131    as a trigger or event trigger).  Some also support polymorphic functions
5132    using the polymorphic pseudo-types, which are shown above and discussed
5133    in detail in <xref linkend="extend-types-polymorphic"/>.
5134   </para>
5135
5136   <para>
5137    The <type>internal</type> pseudo-type is used to declare functions
5138    that are meant only to be called internally by the database
5139    system, and not by direct invocation in an <acronym>SQL</acronym>
5140    query.  If a function has at least one <type>internal</type>-type
5141    argument then it cannot be called from <acronym>SQL</acronym>.  To
5142    preserve the type safety of this restriction it is important to
5143    follow this coding rule: do not create any function that is
5144    declared to return <type>internal</type> unless it has at least one
5145    <type>internal</type> argument.
5146   </para>
5147
5148  </sect1>
5149
5150 </chapter>
5151