1<!-- doc/src/sgml/datatype.sgml --> 2 3 <chapter id="datatype"> 4 <title>Data Types</title> 5 6 <indexterm zone="datatype"> 7 <primary>data type</primary> 8 </indexterm> 9 10 <indexterm> 11 <primary>type</primary> 12 <see>data type</see> 13 </indexterm> 14 15 <para> 16 <productname>PostgreSQL</productname> has a rich set of native data 17 types available to users. Users can add new types to 18 <productname>PostgreSQL</productname> using the <xref 19 linkend="sql-createtype"/> command. 20 </para> 21 22 <para> 23 <xref linkend="datatype-table"/> shows all the built-in general-purpose data 24 types. Most of the alternative names listed in the 25 <quote>Aliases</quote> column are the names used internally by 26 <productname>PostgreSQL</productname> for historical reasons. In 27 addition, some internally used or deprecated types are available, 28 but are not listed here. 29 </para> 30 31 <table id="datatype-table"> 32 <title>Data Types</title> 33 <tgroup cols="3"> 34 <colspec colname="col1" colwidth="2*"/> 35 <colspec colname="col2" colwidth="1*"/> 36 <colspec colname="col3" colwidth="2*"/> 37 <thead> 38 <row> 39 <entry>Name</entry> 40 <entry>Aliases</entry> 41 <entry>Description</entry> 42 </row> 43 </thead> 44 45 <tbody> 46 <row> 47 <entry><type>bigint</type></entry> 48 <entry><type>int8</type></entry> 49 <entry>signed eight-byte integer</entry> 50 </row> 51 52 <row> 53 <entry><type>bigserial</type></entry> 54 <entry><type>serial8</type></entry> 55 <entry>autoincrementing eight-byte integer</entry> 56 </row> 57 58 <row> 59 <entry><type>bit [ (<replaceable>n</replaceable>) ]</type></entry> 60 <entry></entry> 61 <entry>fixed-length bit string</entry> 62 </row> 63 64 <row> 65 <entry><type>bit varying [ (<replaceable>n</replaceable>) ]</type></entry> 66 <entry><type>varbit [ (<replaceable>n</replaceable>) ]</type></entry> 67 <entry>variable-length bit string</entry> 68 </row> 69 70 <row> 71 <entry><type>boolean</type></entry> 72 <entry><type>bool</type></entry> 73 <entry>logical Boolean (true/false)</entry> 74 </row> 75 76 <row> 77 <entry><type>box</type></entry> 78 <entry></entry> 79 <entry>rectangular box on a plane</entry> 80 </row> 81 82 <row> 83 <entry><type>bytea</type></entry> 84 <entry></entry> 85 <entry>binary data (<quote>byte array</quote>)</entry> 86 </row> 87 88 <row> 89 <entry><type>character [ (<replaceable>n</replaceable>) ]</type></entry> 90 <entry><type>char [ (<replaceable>n</replaceable>) ]</type></entry> 91 <entry>fixed-length character string</entry> 92 </row> 93 94 <row> 95 <entry><type>character varying [ (<replaceable>n</replaceable>) ]</type></entry> 96 <entry><type>varchar [ (<replaceable>n</replaceable>) ]</type></entry> 97 <entry>variable-length character string</entry> 98 </row> 99 100 <row> 101 <entry><type>cidr</type></entry> 102 <entry></entry> 103 <entry>IPv4 or IPv6 network address</entry> 104 </row> 105 106 <row> 107 <entry><type>circle</type></entry> 108 <entry></entry> 109 <entry>circle on a plane</entry> 110 </row> 111 112 <row> 113 <entry><type>date</type></entry> 114 <entry></entry> 115 <entry>calendar date (year, month, day)</entry> 116 </row> 117 118 <row> 119 <entry><type>double precision</type></entry> 120 <entry><type>float8</type></entry> 121 <entry>double precision floating-point number (8 bytes)</entry> 122 </row> 123 124 <row> 125 <entry><type>inet</type></entry> 126 <entry></entry> 127 <entry>IPv4 or IPv6 host address</entry> 128 </row> 129 130 <row> 131 <entry><type>integer</type></entry> 132 <entry><type>int</type>, <type>int4</type></entry> 133 <entry>signed four-byte integer</entry> 134 </row> 135 136 <row> 137 <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry> 138 <entry></entry> 139 <entry>time span</entry> 140 </row> 141 142 <row> 143 <entry><type>json</type></entry> 144 <entry></entry> 145 <entry>textual JSON data</entry> 146 </row> 147 148 <row> 149 <entry><type>jsonb</type></entry> 150 <entry></entry> 151 <entry>binary JSON data, decomposed</entry> 152 </row> 153 154 <row> 155 <entry><type>line</type></entry> 156 <entry></entry> 157 <entry>infinite line on a plane</entry> 158 </row> 159 160 <row> 161 <entry><type>lseg</type></entry> 162 <entry></entry> 163 <entry>line segment on a plane</entry> 164 </row> 165 166 <row> 167 <entry><type>macaddr</type></entry> 168 <entry></entry> 169 <entry>MAC (Media Access Control) address</entry> 170 </row> 171 172 <row> 173 <entry><type>macaddr8</type></entry> 174 <entry></entry> 175 <entry>MAC (Media Access Control) address (EUI-64 format)</entry> 176 </row> 177 178 <row> 179 <entry><type>money</type></entry> 180 <entry></entry> 181 <entry>currency amount</entry> 182 </row> 183 184 <row> 185 <entry><type>numeric [ (<replaceable>p</replaceable>, 186 <replaceable>s</replaceable>) ]</type></entry> 187 <entry><type>decimal [ (<replaceable>p</replaceable>, 188 <replaceable>s</replaceable>) ]</type></entry> 189 <entry>exact numeric of selectable precision</entry> 190 </row> 191 192 <row> 193 <entry><type>path</type></entry> 194 <entry></entry> 195 <entry>geometric path on a plane</entry> 196 </row> 197 198 <row> 199 <entry><type>pg_lsn</type></entry> 200 <entry></entry> 201 <entry><productname>PostgreSQL</productname> Log Sequence Number</entry> 202 </row> 203 204 <row> 205 <entry><type>pg_snapshot</type></entry> 206 <entry></entry> 207 <entry>user-level transaction ID snapshot</entry> 208 </row> 209 210 <row> 211 <entry><type>point</type></entry> 212 <entry></entry> 213 <entry>geometric point on a plane</entry> 214 </row> 215 216 <row> 217 <entry><type>polygon</type></entry> 218 <entry></entry> 219 <entry>closed geometric path on a plane</entry> 220 </row> 221 222 <row> 223 <entry><type>real</type></entry> 224 <entry><type>float4</type></entry> 225 <entry>single precision floating-point number (4 bytes)</entry> 226 </row> 227 228 <row> 229 <entry><type>smallint</type></entry> 230 <entry><type>int2</type></entry> 231 <entry>signed two-byte integer</entry> 232 </row> 233 234 <row> 235 <entry><type>smallserial</type></entry> 236 <entry><type>serial2</type></entry> 237 <entry>autoincrementing two-byte integer</entry> 238 </row> 239 240 <row> 241 <entry><type>serial</type></entry> 242 <entry><type>serial4</type></entry> 243 <entry>autoincrementing four-byte integer</entry> 244 </row> 245 246 <row> 247 <entry><type>text</type></entry> 248 <entry></entry> 249 <entry>variable-length character string</entry> 250 </row> 251 252 <row> 253 <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry> 254 <entry></entry> 255 <entry>time of day (no time zone)</entry> 256 </row> 257 258 <row> 259 <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry> 260 <entry><type>timetz</type></entry> 261 <entry>time of day, including time zone</entry> 262 </row> 263 264 <row> 265 <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry> 266 <entry></entry> 267 <entry>date and time (no time zone)</entry> 268 </row> 269 270 <row> 271 <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry> 272 <entry><type>timestamptz</type></entry> 273 <entry>date and time, including time zone</entry> 274 </row> 275 276 <row> 277 <entry><type>tsquery</type></entry> 278 <entry></entry> 279 <entry>text search query</entry> 280 </row> 281 282 <row> 283 <entry><type>tsvector</type></entry> 284 <entry></entry> 285 <entry>text search document</entry> 286 </row> 287 288 <row> 289 <entry><type>txid_snapshot</type></entry> 290 <entry></entry> 291 <entry>user-level transaction ID snapshot (deprecated; see <type>pg_snapshot</type>)</entry> 292 </row> 293 294 <row> 295 <entry><type>uuid</type></entry> 296 <entry></entry> 297 <entry>universally unique identifier</entry> 298 </row> 299 300 <row> 301 <entry><type>xml</type></entry> 302 <entry></entry> 303 <entry>XML data</entry> 304 </row> 305 </tbody> 306 </tgroup> 307 </table> 308 309 <note> 310 <title>Compatibility</title> 311 <para> 312 The following types (or spellings thereof) are specified by 313 <acronym>SQL</acronym>: <type>bigint</type>, <type>bit</type>, <type>bit 314 varying</type>, <type>boolean</type>, <type>char</type>, 315 <type>character varying</type>, <type>character</type>, 316 <type>varchar</type>, <type>date</type>, <type>double 317 precision</type>, <type>integer</type>, <type>interval</type>, 318 <type>numeric</type>, <type>decimal</type>, <type>real</type>, 319 <type>smallint</type>, <type>time</type> (with or without time zone), 320 <type>timestamp</type> (with or without time zone), 321 <type>xml</type>. 322 </para> 323 </note> 324 325 <para> 326 Each data type has an external representation determined by its input 327 and output functions. Many of the built-in types have 328 obvious external formats. However, several types are either unique 329 to <productname>PostgreSQL</productname>, such as geometric 330 paths, or have several possible formats, such as the date 331 and time types. 332 Some of the input and output functions are not invertible, i.e., 333 the result of an output function might lose accuracy when compared to 334 the original input. 335 </para> 336 337 <sect1 id="datatype-numeric"> 338 <title>Numeric Types</title> 339 340 <indexterm zone="datatype-numeric"> 341 <primary>data type</primary> 342 <secondary>numeric</secondary> 343 </indexterm> 344 345 <para> 346 Numeric types consist of two-, four-, and eight-byte integers, 347 four- and eight-byte floating-point numbers, and selectable-precision 348 decimals. <xref linkend="datatype-numeric-table"/> lists the 349 available types. 350 </para> 351 352 <table id="datatype-numeric-table"> 353 <title>Numeric Types</title> 354 <tgroup cols="4"> 355 <colspec colname="col1" colwidth="2*"/> 356 <colspec colname="col2" colwidth="1*"/> 357 <colspec colname="col3" colwidth="2*"/> 358 <colspec colname="col4" colwidth="2*"/> 359 <thead> 360 <row> 361 <entry>Name</entry> 362 <entry>Storage Size</entry> 363 <entry>Description</entry> 364 <entry>Range</entry> 365 </row> 366 </thead> 367 368 <tbody> 369 <row> 370 <entry><type>smallint</type></entry> 371 <entry>2 bytes</entry> 372 <entry>small-range integer</entry> 373 <entry>-32768 to +32767</entry> 374 </row> 375 <row> 376 <entry><type>integer</type></entry> 377 <entry>4 bytes</entry> 378 <entry>typical choice for integer</entry> 379 <entry>-2147483648 to +2147483647</entry> 380 </row> 381 <row> 382 <entry><type>bigint</type></entry> 383 <entry>8 bytes</entry> 384 <entry>large-range integer</entry> 385 <entry>-9223372036854775808 to +9223372036854775807</entry> 386 </row> 387 388 <row> 389 <entry><type>decimal</type></entry> 390 <entry>variable</entry> 391 <entry>user-specified precision, exact</entry> 392 <entry>up to 131072 digits before the decimal point; up to 16383 digits after the decimal point</entry> 393 </row> 394 <row> 395 <entry><type>numeric</type></entry> 396 <entry>variable</entry> 397 <entry>user-specified precision, exact</entry> 398 <entry>up to 131072 digits before the decimal point; up to 16383 digits after the decimal point</entry> 399 </row> 400 401 <row> 402 <entry><type>real</type></entry> 403 <entry>4 bytes</entry> 404 <entry>variable-precision, inexact</entry> 405 <entry>6 decimal digits precision</entry> 406 </row> 407 <row> 408 <entry><type>double precision</type></entry> 409 <entry>8 bytes</entry> 410 <entry>variable-precision, inexact</entry> 411 <entry>15 decimal digits precision</entry> 412 </row> 413 414 <row> 415 <entry><type>smallserial</type></entry> 416 <entry>2 bytes</entry> 417 <entry>small autoincrementing integer</entry> 418 <entry>1 to 32767</entry> 419 </row> 420 421 <row> 422 <entry><type>serial</type></entry> 423 <entry>4 bytes</entry> 424 <entry>autoincrementing integer</entry> 425 <entry>1 to 2147483647</entry> 426 </row> 427 428 <row> 429 <entry><type>bigserial</type></entry> 430 <entry>8 bytes</entry> 431 <entry>large autoincrementing integer</entry> 432 <entry>1 to 9223372036854775807</entry> 433 </row> 434 </tbody> 435 </tgroup> 436 </table> 437 438 <para> 439 The syntax of constants for the numeric types is described in 440 <xref linkend="sql-syntax-constants"/>. The numeric types have a 441 full set of corresponding arithmetic operators and 442 functions. Refer to <xref linkend="functions"/> for more 443 information. The following sections describe the types in detail. 444 </para> 445 446 <sect2 id="datatype-int"> 447 <title>Integer Types</title> 448 449 <indexterm zone="datatype-int"> 450 <primary>integer</primary> 451 </indexterm> 452 453 <indexterm zone="datatype-int"> 454 <primary>smallint</primary> 455 </indexterm> 456 457 <indexterm zone="datatype-int"> 458 <primary>bigint</primary> 459 </indexterm> 460 461 <indexterm> 462 <primary>int4</primary> 463 <see>integer</see> 464 </indexterm> 465 466 <indexterm> 467 <primary>int2</primary> 468 <see>smallint</see> 469 </indexterm> 470 471 <indexterm> 472 <primary>int8</primary> 473 <see>bigint</see> 474 </indexterm> 475 476 <para> 477 The types <type>smallint</type>, <type>integer</type>, and 478 <type>bigint</type> store whole numbers, that is, numbers without 479 fractional components, of various ranges. Attempts to store 480 values outside of the allowed range will result in an error. 481 </para> 482 483 <para> 484 The type <type>integer</type> is the common choice, as it offers 485 the best balance between range, storage size, and performance. 486 The <type>smallint</type> type is generally only used if disk 487 space is at a premium. The <type>bigint</type> type is designed to be 488 used when the range of the <type>integer</type> type is insufficient. 489 </para> 490 491 <para> 492 <acronym>SQL</acronym> only specifies the integer types 493 <type>integer</type> (or <type>int</type>), 494 <type>smallint</type>, and <type>bigint</type>. The 495 type names <type>int2</type>, <type>int4</type>, and 496 <type>int8</type> are extensions, which are also used by some 497 other <acronym>SQL</acronym> database systems. 498 </para> 499 500 </sect2> 501 502 <sect2 id="datatype-numeric-decimal"> 503 <title>Arbitrary Precision Numbers</title> 504 505 <indexterm> 506 <primary>numeric (data type)</primary> 507 </indexterm> 508 509 <indexterm> 510 <primary>arbitrary precision numbers</primary> 511 </indexterm> 512 513 <indexterm> 514 <primary>decimal</primary> 515 <see>numeric</see> 516 </indexterm> 517 518 <para> 519 The type <type>numeric</type> can store numbers with a 520 very large number of digits. It is especially recommended for 521 storing monetary amounts and other quantities where exactness is 522 required. Calculations with <type>numeric</type> values yield exact 523 results where possible, e.g., addition, subtraction, multiplication. 524 However, calculations on <type>numeric</type> values are very slow 525 compared to the integer types, or to the floating-point types 526 described in the next section. 527 </para> 528 529 <para> 530 We use the following terms below: The 531 <firstterm>precision</firstterm> of a <type>numeric</type> 532 is the total count of significant digits in the whole number, 533 that is, the number of digits to both sides of the decimal point. 534 The <firstterm>scale</firstterm> of a <type>numeric</type> is the 535 count of decimal digits in the fractional part, to the right of the 536 decimal point. So the number 23.5141 has a precision of 6 and a 537 scale of 4. Integers can be considered to have a scale of zero. 538 </para> 539 540 <para> 541 Both the maximum precision and the maximum scale of a 542 <type>numeric</type> column can be 543 configured. To declare a column of type <type>numeric</type> use 544 the syntax: 545<programlisting> 546NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>) 547</programlisting> 548 The precision must be positive, the scale zero or positive. 549 Alternatively: 550<programlisting> 551NUMERIC(<replaceable>precision</replaceable>) 552</programlisting> 553 selects a scale of 0. Specifying: 554<programlisting> 555NUMERIC 556</programlisting> 557 without any precision or scale creates a column in which numeric 558 values of any precision and scale can be stored, up to the 559 implementation limit on precision. A column of this kind will 560 not coerce input values to any particular scale, whereas 561 <type>numeric</type> columns with a declared scale will coerce 562 input values to that scale. (The <acronym>SQL</acronym> standard 563 requires a default scale of 0, i.e., coercion to integer 564 precision. We find this a bit useless. If you're concerned 565 about portability, always specify the precision and scale 566 explicitly.) 567 </para> 568 569 <note> 570 <para> 571 The maximum allowed precision when explicitly specified in the 572 type declaration is 1000; <type>NUMERIC</type> without a specified 573 precision is subject to the limits described in <xref 574 linkend="datatype-numeric-table"/>. 575 </para> 576 </note> 577 578 <para> 579 If the scale of a value to be stored is greater than the declared 580 scale of the column, the system will round the value to the specified 581 number of fractional digits. Then, if the number of digits to the 582 left of the decimal point exceeds the declared precision minus the 583 declared scale, an error is raised. 584 </para> 585 586 <para> 587 Numeric values are physically stored without any extra leading or 588 trailing zeroes. Thus, the declared precision and scale of a column 589 are maximums, not fixed allocations. (In this sense the <type>numeric</type> 590 type is more akin to <type>varchar(<replaceable>n</replaceable>)</type> 591 than to <type>char(<replaceable>n</replaceable>)</type>.) The actual storage 592 requirement is two bytes for each group of four decimal digits, 593 plus three to eight bytes overhead. 594 </para> 595 596 <indexterm> 597 <primary>NaN</primary> 598 <see>not a number</see> 599 </indexterm> 600 601 <indexterm> 602 <primary>not a number</primary> 603 <secondary>numeric (data type)</secondary> 604 </indexterm> 605 606 <para> 607 In addition to ordinary numeric values, the <type>numeric</type> 608 type allows the special value <literal>NaN</literal>, meaning 609 <quote>not-a-number</quote>. Any operation on <literal>NaN</literal> 610 yields another <literal>NaN</literal>. When writing this value 611 as a constant in an SQL command, you must put quotes around it, 612 for example <literal>UPDATE table SET x = 'NaN'</literal>. On input, 613 the string <literal>NaN</literal> is recognized in a case-insensitive manner. 614 </para> 615 616 <note> 617 <para> 618 In most implementations of the <quote>not-a-number</quote> concept, 619 <literal>NaN</literal> is not considered equal to any other numeric 620 value (including <literal>NaN</literal>). In order to allow 621 <type>numeric</type> values to be sorted and used in tree-based 622 indexes, <productname>PostgreSQL</productname> treats <literal>NaN</literal> 623 values as equal, and greater than all non-<literal>NaN</literal> 624 values. 625 </para> 626 </note> 627 628 <para> 629 The types <type>decimal</type> and <type>numeric</type> are 630 equivalent. Both types are part of the <acronym>SQL</acronym> 631 standard. 632 </para> 633 634 <para> 635 When rounding values, the <type>numeric</type> type rounds ties away 636 from zero, while (on most machines) the <type>real</type> 637 and <type>double precision</type> types round ties to the nearest even 638 number. For example: 639 640<programlisting> 641SELECT x, 642 round(x::numeric) AS num_round, 643 round(x::double precision) AS dbl_round 644FROM generate_series(-3.5, 3.5, 1) as x; 645 x | num_round | dbl_round 646------+-----------+----------- 647 -3.5 | -4 | -4 648 -2.5 | -3 | -2 649 -1.5 | -2 | -2 650 -0.5 | -1 | -0 651 0.5 | 1 | 0 652 1.5 | 2 | 2 653 2.5 | 3 | 2 654 3.5 | 4 | 4 655(8 rows) 656</programlisting> 657 </para> 658 </sect2> 659 660 661 <sect2 id="datatype-float"> 662 <title>Floating-Point Types</title> 663 664 <indexterm zone="datatype-float"> 665 <primary>real</primary> 666 </indexterm> 667 668 <indexterm zone="datatype-float"> 669 <primary>double precision</primary> 670 </indexterm> 671 672 <indexterm> 673 <primary>float4</primary> 674 <see>real</see> 675 </indexterm> 676 677 <indexterm> 678 <primary>float8</primary> 679 <see>double precision</see> 680 </indexterm> 681 682 <indexterm zone="datatype-float"> 683 <primary>floating point</primary> 684 </indexterm> 685 686 <para> 687 The data types <type>real</type> and <type>double precision</type> are 688 inexact, variable-precision numeric types. On all currently supported 689 platforms, these types are implementations of <acronym>IEEE</acronym> 690 Standard 754 for Binary Floating-Point Arithmetic (single and double 691 precision, respectively), to the extent that the underlying processor, 692 operating system, and compiler support it. 693 </para> 694 695 <para> 696 Inexact means that some values cannot be converted exactly to the 697 internal format and are stored as approximations, so that storing 698 and retrieving a value might show slight discrepancies. 699 Managing these errors and how they propagate through calculations 700 is the subject of an entire branch of mathematics and computer 701 science and will not be discussed here, except for the 702 following points: 703 <itemizedlist> 704 <listitem> 705 <para> 706 If you require exact storage and calculations (such as for 707 monetary amounts), use the <type>numeric</type> type instead. 708 </para> 709 </listitem> 710 711 <listitem> 712 <para> 713 If you want to do complicated calculations with these types 714 for anything important, especially if you rely on certain 715 behavior in boundary cases (infinity, underflow), you should 716 evaluate the implementation carefully. 717 </para> 718 </listitem> 719 720 <listitem> 721 <para> 722 Comparing two floating-point values for equality might not 723 always work as expected. 724 </para> 725 </listitem> 726 </itemizedlist> 727 </para> 728 729 <para> 730 On all currently supported platforms, the <type>real</type> type has a 731 range of around 1E-37 to 1E+37 with a precision of at least 6 decimal 732 digits. The <type>double precision</type> type has a range of around 733 1E-307 to 1E+308 with a precision of at least 15 digits. Values that are 734 too large or too small will cause an error. Rounding might take place if 735 the precision of an input number is too high. Numbers too close to zero 736 that are not representable as distinct from zero will cause an underflow 737 error. 738 </para> 739 740 <para> 741 By default, floating point values are output in text form in their 742 shortest precise decimal representation; the decimal value produced is 743 closer to the true stored binary value than to any other value 744 representable in the same binary precision. (However, the output value is 745 currently never <emphasis>exactly</emphasis> midway between two 746 representable values, in order to avoid a widespread bug where input 747 routines do not properly respect the round-to-nearest-even rule.) This value will 748 use at most 17 significant decimal digits for <type>float8</type> 749 values, and at most 9 digits for <type>float4</type> values. 750 </para> 751 752 <note> 753 <para> 754 This shortest-precise output format is much faster to generate than the 755 historical rounded format. 756 </para> 757 </note> 758 759 <para> 760 For compatibility with output generated by older versions 761 of <productname>PostgreSQL</productname>, and to allow the output 762 precision to be reduced, the <xref linkend="guc-extra-float-digits"/> 763 parameter can be used to select rounded decimal output instead. Setting a 764 value of 0 restores the previous default of rounding the value to 6 765 (for <type>float4</type>) or 15 (for <type>float8</type>) 766 significant decimal digits. Setting a negative value reduces the number 767 of digits further; for example -2 would round output to 4 or 13 digits 768 respectively. 769 </para> 770 771 <para> 772 Any value of <xref linkend="guc-extra-float-digits"/> greater than 0 773 selects the shortest-precise format. 774 </para> 775 776 <note> 777 <para> 778 Applications that wanted precise values have historically had to set 779 <xref linkend="guc-extra-float-digits"/> to 3 to obtain them. For 780 maximum compatibility between versions, they should continue to do so. 781 </para> 782 </note> 783 784 <indexterm> 785 <primary>not a number</primary> 786 <secondary>double precision</secondary> 787 </indexterm> 788 789 <para> 790 In addition to ordinary numeric values, the floating-point types 791 have several special values: 792<literallayout> 793<literal>Infinity</literal> 794<literal>-Infinity</literal> 795<literal>NaN</literal> 796</literallayout> 797 These represent the IEEE 754 special values 798 <quote>infinity</quote>, <quote>negative infinity</quote>, and 799 <quote>not-a-number</quote>, respectively. When writing these values 800 as constants in an SQL command, you must put quotes around them, 801 for example <literal>UPDATE table SET x = '-Infinity'</literal>. On input, 802 these strings are recognized in a case-insensitive manner. 803 </para> 804 805 <note> 806 <para> 807 IEEE754 specifies that <literal>NaN</literal> should not compare equal 808 to any other floating-point value (including <literal>NaN</literal>). 809 In order to allow floating-point values to be sorted and used 810 in tree-based indexes, <productname>PostgreSQL</productname> treats 811 <literal>NaN</literal> values as equal, and greater than all 812 non-<literal>NaN</literal> values. 813 </para> 814 </note> 815 816 <para> 817 <productname>PostgreSQL</productname> also supports the SQL-standard 818 notations <type>float</type> and 819 <type>float(<replaceable>p</replaceable>)</type> for specifying 820 inexact numeric types. Here, <replaceable>p</replaceable> specifies 821 the minimum acceptable precision in <emphasis>binary</emphasis> digits. 822 <productname>PostgreSQL</productname> accepts 823 <type>float(1)</type> to <type>float(24)</type> as selecting the 824 <type>real</type> type, while 825 <type>float(25)</type> to <type>float(53)</type> select 826 <type>double precision</type>. Values of <replaceable>p</replaceable> 827 outside the allowed range draw an error. 828 <type>float</type> with no precision specified is taken to mean 829 <type>double precision</type>. 830 </para> 831 832 </sect2> 833 834 <sect2 id="datatype-serial"> 835 <title>Serial Types</title> 836 837 <indexterm zone="datatype-serial"> 838 <primary>smallserial</primary> 839 </indexterm> 840 841 <indexterm zone="datatype-serial"> 842 <primary>serial</primary> 843 </indexterm> 844 845 <indexterm zone="datatype-serial"> 846 <primary>bigserial</primary> 847 </indexterm> 848 849 <indexterm zone="datatype-serial"> 850 <primary>serial2</primary> 851 </indexterm> 852 853 <indexterm zone="datatype-serial"> 854 <primary>serial4</primary> 855 </indexterm> 856 857 <indexterm zone="datatype-serial"> 858 <primary>serial8</primary> 859 </indexterm> 860 861 <indexterm> 862 <primary>auto-increment</primary> 863 <see>serial</see> 864 </indexterm> 865 866 <indexterm> 867 <primary>sequence</primary> 868 <secondary>and serial type</secondary> 869 </indexterm> 870 871 <note> 872 <para> 873 This section describes a PostgreSQL-specific way to create an 874 autoincrementing column. Another way is to use the SQL-standard 875 identity column feature, described at <xref linkend="sql-createtable"/>. 876 </para> 877 </note> 878 879 <para> 880 The data types <type>smallserial</type>, <type>serial</type> and 881 <type>bigserial</type> are not true types, but merely 882 a notational convenience for creating unique identifier columns 883 (similar to the <literal>AUTO_INCREMENT</literal> property 884 supported by some other databases). In the current 885 implementation, specifying: 886 887<programlisting> 888CREATE TABLE <replaceable class="parameter">tablename</replaceable> ( 889 <replaceable class="parameter">colname</replaceable> SERIAL 890); 891</programlisting> 892 893 is equivalent to specifying: 894 895<programlisting> 896CREATE SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq AS integer; 897CREATE TABLE <replaceable class="parameter">tablename</replaceable> ( 898 <replaceable class="parameter">colname</replaceable> integer NOT NULL DEFAULT nextval('<replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq') 899); 900ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq OWNED BY <replaceable class="parameter">tablename</replaceable>.<replaceable class="parameter">colname</replaceable>; 901</programlisting> 902 903 Thus, we have created an integer column and arranged for its default 904 values to be assigned from a sequence generator. A <literal>NOT NULL</literal> 905 constraint is applied to ensure that a null value cannot be 906 inserted. (In most cases you would also want to attach a 907 <literal>UNIQUE</literal> or <literal>PRIMARY KEY</literal> constraint to prevent 908 duplicate values from being inserted by accident, but this is 909 not automatic.) Lastly, the sequence is marked as <quote>owned by</quote> 910 the column, so that it will be dropped if the column or table is dropped. 911 </para> 912 913 <note> 914 <para> 915 Because <type>smallserial</type>, <type>serial</type> and 916 <type>bigserial</type> are implemented using sequences, there may 917 be "holes" or gaps in the sequence of values which appears in the 918 column, even if no rows are ever deleted. A value allocated 919 from the sequence is still "used up" even if a row containing that 920 value is never successfully inserted into the table column. This 921 may happen, for example, if the inserting transaction rolls back. 922 See <literal>nextval()</literal> in <xref linkend="functions-sequence"/> 923 for details. 924 </para> 925 </note> 926 927 <para> 928 To insert the next value of the sequence into the <type>serial</type> 929 column, specify that the <type>serial</type> 930 column should be assigned its default value. This can be done 931 either by excluding the column from the list of columns in 932 the <command>INSERT</command> statement, or through the use of 933 the <literal>DEFAULT</literal> key word. 934 </para> 935 936 <para> 937 The type names <type>serial</type> and <type>serial4</type> are 938 equivalent: both create <type>integer</type> columns. The type 939 names <type>bigserial</type> and <type>serial8</type> work 940 the same way, except that they create a <type>bigint</type> 941 column. <type>bigserial</type> should be used if you anticipate 942 the use of more than 2<superscript>31</superscript> identifiers over the 943 lifetime of the table. The type names <type>smallserial</type> and 944 <type>serial2</type> also work the same way, except that they 945 create a <type>smallint</type> column. 946 </para> 947 948 <para> 949 The sequence created for a <type>serial</type> column is 950 automatically dropped when the owning column is dropped. 951 You can drop the sequence without dropping the column, but this 952 will force removal of the column default expression. 953 </para> 954 </sect2> 955 </sect1> 956 957 <sect1 id="datatype-money"> 958 <title>Monetary Types</title> 959 960 <para> 961 The <type>money</type> type stores a currency amount with a fixed 962 fractional precision; see <xref 963 linkend="datatype-money-table"/>. The fractional precision is 964 determined by the database's <xref linkend="guc-lc-monetary"/> setting. 965 The range shown in the table assumes there are two fractional digits. 966 Input is accepted in a variety of formats, including integer and 967 floating-point literals, as well as typical 968 currency formatting, such as <literal>'$1,000.00'</literal>. 969 Output is generally in the latter form but depends on the locale. 970 </para> 971 972 <table id="datatype-money-table"> 973 <title>Monetary Types</title> 974 <tgroup cols="4"> 975 <colspec colname="col1" colwidth="2*"/> 976 <colspec colname="col2" colwidth="1*"/> 977 <colspec colname="col3" colwidth="2*"/> 978 <colspec colname="col4" colwidth="2*"/> 979 <thead> 980 <row> 981 <entry>Name</entry> 982 <entry>Storage Size</entry> 983 <entry>Description</entry> 984 <entry>Range</entry> 985 </row> 986 </thead> 987 <tbody> 988 <row> 989 <entry><type>money</type></entry> 990 <entry>8 bytes</entry> 991 <entry>currency amount</entry> 992 <entry>-92233720368547758.08 to +92233720368547758.07</entry> 993 </row> 994 </tbody> 995 </tgroup> 996 </table> 997 998 <para> 999 Since the output of this data type is locale-sensitive, it might not 1000 work to load <type>money</type> data into a database that has a different 1001 setting of <varname>lc_monetary</varname>. To avoid problems, before 1002 restoring a dump into a new database make sure <varname>lc_monetary</varname> has 1003 the same or equivalent value as in the database that was dumped. 1004 </para> 1005 1006 <para> 1007 Values of the <type>numeric</type>, <type>int</type>, and 1008 <type>bigint</type> data types can be cast to <type>money</type>. 1009 Conversion from the <type>real</type> and <type>double precision</type> 1010 data types can be done by casting to <type>numeric</type> first, for 1011 example: 1012<programlisting> 1013SELECT '12.34'::float8::numeric::money; 1014</programlisting> 1015 However, this is not recommended. Floating point numbers should not be 1016 used to handle money due to the potential for rounding errors. 1017 </para> 1018 1019 <para> 1020 A <type>money</type> value can be cast to <type>numeric</type> without 1021 loss of precision. Conversion to other types could potentially lose 1022 precision, and must also be done in two stages: 1023<programlisting> 1024SELECT '52093.89'::money::numeric::float8; 1025</programlisting> 1026 </para> 1027 1028 <para> 1029 Division of a <type>money</type> value by an integer value is performed 1030 with truncation of the fractional part towards zero. To get a rounded 1031 result, divide by a floating-point value, or cast the <type>money</type> 1032 value to <type>numeric</type> before dividing and back to <type>money</type> 1033 afterwards. (The latter is preferable to avoid risking precision loss.) 1034 When a <type>money</type> value is divided by another <type>money</type> 1035 value, the result is <type>double precision</type> (i.e., a pure number, 1036 not money); the currency units cancel each other out in the division. 1037 </para> 1038 </sect1> 1039 1040 1041 <sect1 id="datatype-character"> 1042 <title>Character Types</title> 1043 1044 <indexterm zone="datatype-character"> 1045 <primary>character string</primary> 1046 <secondary>data types</secondary> 1047 </indexterm> 1048 1049 <indexterm> 1050 <primary>string</primary> 1051 <see>character string</see> 1052 </indexterm> 1053 1054 <indexterm zone="datatype-character"> 1055 <primary>character</primary> 1056 </indexterm> 1057 1058 <indexterm zone="datatype-character"> 1059 <primary>character varying</primary> 1060 </indexterm> 1061 1062 <indexterm zone="datatype-character"> 1063 <primary>text</primary> 1064 </indexterm> 1065 1066 <indexterm zone="datatype-character"> 1067 <primary>char</primary> 1068 </indexterm> 1069 1070 <indexterm zone="datatype-character"> 1071 <primary>varchar</primary> 1072 </indexterm> 1073 1074 <table id="datatype-character-table"> 1075 <title>Character Types</title> 1076 <tgroup cols="2"> 1077 <thead> 1078 <row> 1079 <entry>Name</entry> 1080 <entry>Description</entry> 1081 </row> 1082 </thead> 1083 <tbody> 1084 <row> 1085 <entry><type>character varying(<replaceable>n</replaceable>)</type>, <type>varchar(<replaceable>n</replaceable>)</type></entry> 1086 <entry>variable-length with limit</entry> 1087 </row> 1088 <row> 1089 <entry><type>character(<replaceable>n</replaceable>)</type>, <type>char(<replaceable>n</replaceable>)</type></entry> 1090 <entry>fixed-length, blank padded</entry> 1091 </row> 1092 <row> 1093 <entry><type>text</type></entry> 1094 <entry>variable unlimited length</entry> 1095 </row> 1096 </tbody> 1097 </tgroup> 1098 </table> 1099 1100 <para> 1101 <xref linkend="datatype-character-table"/> shows the 1102 general-purpose character types available in 1103 <productname>PostgreSQL</productname>. 1104 </para> 1105 1106 <para> 1107 <acronym>SQL</acronym> defines two primary character types: 1108 <type>character varying(<replaceable>n</replaceable>)</type> and 1109 <type>character(<replaceable>n</replaceable>)</type>, where <replaceable>n</replaceable> 1110 is a positive integer. Both of these types can store strings up to 1111 <replaceable>n</replaceable> characters (not bytes) in length. An attempt to store a 1112 longer string into a column of these types will result in an 1113 error, unless the excess characters are all spaces, in which case 1114 the string will be truncated to the maximum length. (This somewhat 1115 bizarre exception is required by the <acronym>SQL</acronym> 1116 standard.) If the string to be stored is shorter than the declared 1117 length, values of type <type>character</type> will be space-padded; 1118 values of type <type>character varying</type> will simply store the 1119 shorter 1120 string. 1121 </para> 1122 1123 <para> 1124 If one explicitly casts a value to <type>character 1125 varying(<replaceable>n</replaceable>)</type> or 1126 <type>character(<replaceable>n</replaceable>)</type>, then an over-length 1127 value will be truncated to <replaceable>n</replaceable> characters without 1128 raising an error. (This too is required by the 1129 <acronym>SQL</acronym> standard.) 1130 </para> 1131 1132 <para> 1133 The notations <type>varchar(<replaceable>n</replaceable>)</type> and 1134 <type>char(<replaceable>n</replaceable>)</type> are aliases for <type>character 1135 varying(<replaceable>n</replaceable>)</type> and 1136 <type>character(<replaceable>n</replaceable>)</type>, respectively. 1137 <type>character</type> without length specifier is equivalent to 1138 <type>character(1)</type>. If <type>character varying</type> is used 1139 without length specifier, the type accepts strings of any size. The 1140 latter is a <productname>PostgreSQL</productname> extension. 1141 </para> 1142 1143 <para> 1144 In addition, <productname>PostgreSQL</productname> provides the 1145 <type>text</type> type, which stores strings of any length. 1146 Although the type <type>text</type> is not in the 1147 <acronym>SQL</acronym> standard, several other SQL database 1148 management systems have it as well. 1149 </para> 1150 1151 <para> 1152 Values of type <type>character</type> are physically padded 1153 with spaces to the specified width <replaceable>n</replaceable>, and are 1154 stored and displayed that way. However, trailing spaces are treated as 1155 semantically insignificant and disregarded when comparing two values 1156 of type <type>character</type>. In collations where whitespace 1157 is significant, this behavior can produce unexpected results; 1158 for example <command>SELECT 'a '::CHAR(2) collate "C" < 1159 E'a\n'::CHAR(2)</command> returns true, even though <literal>C</literal> 1160 locale would consider a space to be greater than a newline. 1161 Trailing spaces are removed when converting a <type>character</type> value 1162 to one of the other string types. Note that trailing spaces 1163 <emphasis>are</emphasis> semantically significant in 1164 <type>character varying</type> and <type>text</type> values, and 1165 when using pattern matching, that is <literal>LIKE</literal> and 1166 regular expressions. 1167 </para> 1168 1169 <para> 1170 The characters that can be stored in any of these data types are 1171 determined by the database character set, which is selected when 1172 the database is created. Regardless of the specific character set, 1173 the character with code zero (sometimes called NUL) cannot be stored. 1174 For more information refer to <xref linkend="multibyte"/>. 1175 </para> 1176 1177 <para> 1178 The storage requirement for a short string (up to 126 bytes) is 1 byte 1179 plus the actual string, which includes the space padding in the case of 1180 <type>character</type>. Longer strings have 4 bytes of overhead instead 1181 of 1. Long strings are compressed by the system automatically, so 1182 the physical requirement on disk might be less. Very long values are also 1183 stored in background tables so that they do not interfere with rapid 1184 access to shorter column values. In any case, the longest 1185 possible character string that can be stored is about 1 GB. (The 1186 maximum value that will be allowed for <replaceable>n</replaceable> in the data 1187 type declaration is less than that. It wouldn't be useful to 1188 change this because with multibyte character encodings the number of 1189 characters and bytes can be quite different. If you desire to 1190 store long strings with no specific upper limit, use 1191 <type>text</type> or <type>character varying</type> without a length 1192 specifier, rather than making up an arbitrary length limit.) 1193 </para> 1194 1195 <tip> 1196 <para> 1197 There is no performance difference among these three types, 1198 apart from increased storage space when using the blank-padded 1199 type, and a few extra CPU cycles to check the length when storing into 1200 a length-constrained column. While 1201 <type>character(<replaceable>n</replaceable>)</type> has performance 1202 advantages in some other database systems, there is no such advantage in 1203 <productname>PostgreSQL</productname>; in fact 1204 <type>character(<replaceable>n</replaceable>)</type> is usually the slowest of 1205 the three because of its additional storage costs. In most situations 1206 <type>text</type> or <type>character varying</type> should be used 1207 instead. 1208 </para> 1209 </tip> 1210 1211 <para> 1212 Refer to <xref linkend="sql-syntax-strings"/> for information about 1213 the syntax of string literals, and to <xref linkend="functions"/> 1214 for information about available operators and functions. 1215 </para> 1216 1217 <example> 1218 <title>Using the Character Types</title> 1219 1220<programlisting> 1221CREATE TABLE test1 (a character(4)); 1222INSERT INTO test1 VALUES ('ok'); 1223SELECT a, char_length(a) FROM test1; -- <co id="co.datatype-char"/> 1224<computeroutput> 1225 a | char_length 1226------+------------- 1227 ok | 2 1228</computeroutput> 1229 1230CREATE TABLE test2 (b varchar(5)); 1231INSERT INTO test2 VALUES ('ok'); 1232INSERT INTO test2 VALUES ('good '); 1233INSERT INTO test2 VALUES ('too long'); 1234<computeroutput>ERROR: value too long for type character varying(5)</computeroutput> 1235INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation 1236SELECT b, char_length(b) FROM test2; 1237<computeroutput> 1238 b | char_length 1239-------+------------- 1240 ok | 2 1241 good | 5 1242 too l | 5 1243</computeroutput> 1244</programlisting> 1245 <calloutlist> 1246 <callout arearefs="co.datatype-char"> 1247 <para> 1248 The <function>char_length</function> function is discussed in 1249 <xref linkend="functions-string"/>. 1250 </para> 1251 </callout> 1252 </calloutlist> 1253 </example> 1254 1255 <para> 1256 There are two other fixed-length character types in 1257 <productname>PostgreSQL</productname>, shown in <xref 1258 linkend="datatype-character-special-table"/>. The <type>name</type> 1259 type exists <emphasis>only</emphasis> for the storage of identifiers 1260 in the internal system catalogs and is not intended for use by the general user. Its 1261 length is currently defined as 64 bytes (63 usable characters plus 1262 terminator) but should be referenced using the constant 1263 <symbol>NAMEDATALEN</symbol> in <literal>C</literal> source code. 1264 The length is set at compile time (and 1265 is therefore adjustable for special uses); the default maximum 1266 length might change in a future release. The type <type>"char"</type> 1267 (note the quotes) is different from <type>char(1)</type> in that it 1268 only uses one byte of storage. It is internally used in the system 1269 catalogs as a simplistic enumeration type. 1270 </para> 1271 1272 <table id="datatype-character-special-table"> 1273 <title>Special Character Types</title> 1274 <tgroup cols="3"> 1275 <thead> 1276 <row> 1277 <entry>Name</entry> 1278 <entry>Storage Size</entry> 1279 <entry>Description</entry> 1280 </row> 1281 </thead> 1282 <tbody> 1283 <row> 1284 <entry><type>"char"</type></entry> 1285 <entry>1 byte</entry> 1286 <entry>single-byte internal type</entry> 1287 </row> 1288 <row> 1289 <entry><type>name</type></entry> 1290 <entry>64 bytes</entry> 1291 <entry>internal type for object names</entry> 1292 </row> 1293 </tbody> 1294 </tgroup> 1295 </table> 1296 1297 </sect1> 1298 1299 <sect1 id="datatype-binary"> 1300 <title>Binary Data Types</title> 1301 1302 <indexterm zone="datatype-binary"> 1303 <primary>binary data</primary> 1304 </indexterm> 1305 1306 <indexterm zone="datatype-binary"> 1307 <primary>bytea</primary> 1308 </indexterm> 1309 1310 <para> 1311 The <type>bytea</type> data type allows storage of binary strings; 1312 see <xref linkend="datatype-binary-table"/>. 1313 </para> 1314 1315 <table id="datatype-binary-table"> 1316 <title>Binary Data Types</title> 1317 <tgroup cols="3"> 1318 <colspec colname="col1" colwidth="1*"/> 1319 <colspec colname="col2" colwidth="3*"/> 1320 <colspec colname="col3" colwidth="2*"/> 1321 <thead> 1322 <row> 1323 <entry>Name</entry> 1324 <entry>Storage Size</entry> 1325 <entry>Description</entry> 1326 </row> 1327 </thead> 1328 <tbody> 1329 <row> 1330 <entry><type>bytea</type></entry> 1331 <entry>1 or 4 bytes plus the actual binary string</entry> 1332 <entry>variable-length binary string</entry> 1333 </row> 1334 </tbody> 1335 </tgroup> 1336 </table> 1337 1338 <para> 1339 A binary string is a sequence of octets (or bytes). Binary 1340 strings are distinguished from character strings in two 1341 ways. First, binary strings specifically allow storing 1342 octets of value zero and other <quote>non-printable</quote> 1343 octets (usually, octets outside the decimal range 32 to 126). 1344 Character strings disallow zero octets, and also disallow any 1345 other octet values and sequences of octet values that are invalid 1346 according to the database's selected character set encoding. 1347 Second, operations on binary strings process the actual bytes, 1348 whereas the processing of character strings depends on locale settings. 1349 In short, binary strings are appropriate for storing data that the 1350 programmer thinks of as <quote>raw bytes</quote>, whereas character 1351 strings are appropriate for storing text. 1352 </para> 1353 1354 <para> 1355 The <type>bytea</type> type supports two 1356 formats for input and output: <quote>hex</quote> format 1357 and <productname>PostgreSQL</productname>'s historical 1358 <quote>escape</quote> format. Both 1359 of these are always accepted on input. The output format depends 1360 on the configuration parameter <xref linkend="guc-bytea-output"/>; 1361 the default is hex. (Note that the hex format was introduced in 1362 <productname>PostgreSQL</productname> 9.0; earlier versions and some 1363 tools don't understand it.) 1364 </para> 1365 1366 <para> 1367 The <acronym>SQL</acronym> standard defines a different binary 1368 string type, called <type>BLOB</type> or <type>BINARY LARGE 1369 OBJECT</type>. The input format is different from 1370 <type>bytea</type>, but the provided functions and operators are 1371 mostly the same. 1372 </para> 1373 1374 <sect2> 1375 <title><type>bytea</type> Hex Format</title> 1376 1377 <para> 1378 The <quote>hex</quote> format encodes binary data as 2 hexadecimal digits 1379 per byte, most significant nibble first. The entire string is 1380 preceded by the sequence <literal>\x</literal> (to distinguish it 1381 from the escape format). In some contexts, the initial backslash may 1382 need to be escaped by doubling it 1383 (see <xref linkend="sql-syntax-strings"/>). 1384 For input, the hexadecimal digits can 1385 be either upper or lower case, and whitespace is permitted between 1386 digit pairs (but not within a digit pair nor in the starting 1387 <literal>\x</literal> sequence). 1388 The hex format is compatible with a wide 1389 range of external applications and protocols, and it tends to be 1390 faster to convert than the escape format, so its use is preferred. 1391 </para> 1392 1393 <para> 1394 Example: 1395<programlisting> 1396SELECT '\xDEADBEEF'; 1397</programlisting> 1398 </para> 1399 </sect2> 1400 1401 <sect2> 1402 <title><type>bytea</type> Escape Format</title> 1403 1404 <para> 1405 The <quote>escape</quote> format is the traditional 1406 <productname>PostgreSQL</productname> format for the <type>bytea</type> 1407 type. It 1408 takes the approach of representing a binary string as a sequence 1409 of ASCII characters, while converting those bytes that cannot be 1410 represented as an ASCII character into special escape sequences. 1411 If, from the point of view of the application, representing bytes 1412 as characters makes sense, then this representation can be 1413 convenient. But in practice it is usually confusing because it 1414 fuzzes up the distinction between binary strings and character 1415 strings, and also the particular escape mechanism that was chosen is 1416 somewhat unwieldy. Therefore, this format should probably be avoided 1417 for most new applications. 1418 </para> 1419 1420 <para> 1421 When entering <type>bytea</type> values in escape format, 1422 octets of certain 1423 values <emphasis>must</emphasis> be escaped, while all octet 1424 values <emphasis>can</emphasis> be escaped. In 1425 general, to escape an octet, convert it into its three-digit 1426 octal value and precede it by a backslash. 1427 Backslash itself (octet decimal value 92) can alternatively be represented by 1428 double backslashes. 1429 <xref linkend="datatype-binary-sqlesc"/> 1430 shows the characters that must be escaped, and gives the alternative 1431 escape sequences where applicable. 1432 </para> 1433 1434 <table id="datatype-binary-sqlesc"> 1435 <title><type>bytea</type> Literal Escaped Octets</title> 1436 <tgroup cols="5"> 1437 <colspec colname="col1" colwidth="1*"/> 1438 <colspec colname="col2" colwidth="1*"/> 1439 <colspec colname="col3" colwidth="1*"/> 1440 <colspec colname="col4" colwidth="1.25*"/> 1441 <colspec colname="col5" colwidth="1*"/> 1442 <thead> 1443 <row> 1444 <entry>Decimal Octet Value</entry> 1445 <entry>Description</entry> 1446 <entry>Escaped Input Representation</entry> 1447 <entry>Example</entry> 1448 <entry>Hex Representation</entry> 1449 </row> 1450 </thead> 1451 1452 <tbody> 1453 <row> 1454 <entry>0</entry> 1455 <entry>zero octet</entry> 1456 <entry><literal>'\000'</literal></entry> 1457 <entry><literal>'\000'::bytea</literal></entry> 1458 <entry><literal>\x00</literal></entry> 1459 </row> 1460 1461 <row> 1462 <entry>39</entry> 1463 <entry>single quote</entry> 1464 <entry><literal>''''</literal> or <literal>'\047'</literal></entry> 1465 <entry><literal>''''::bytea</literal></entry> 1466 <entry><literal>\x27</literal></entry> 1467 </row> 1468 1469 <row> 1470 <entry>92</entry> 1471 <entry>backslash</entry> 1472 <entry><literal>'\\'</literal> or <literal>'\134'</literal></entry> 1473 <entry><literal>'\\'::bytea</literal></entry> 1474 <entry><literal>\x5c</literal></entry> 1475 </row> 1476 1477 <row> 1478 <entry>0 to 31 and 127 to 255</entry> 1479 <entry><quote>non-printable</quote> octets</entry> 1480 <entry><literal>'\<replaceable>xxx'</replaceable></literal> (octal value)</entry> 1481 <entry><literal>'\001'::bytea</literal></entry> 1482 <entry><literal>\x01</literal></entry> 1483 </row> 1484 1485 </tbody> 1486 </tgroup> 1487 </table> 1488 1489 <para> 1490 The requirement to escape <emphasis>non-printable</emphasis> octets 1491 varies depending on locale settings. In some instances you can get away 1492 with leaving them unescaped. 1493 </para> 1494 1495 <para> 1496 The reason that single quotes must be doubled, as shown 1497 in <xref linkend="datatype-binary-sqlesc"/>, is that this 1498 is true for any string literal in a SQL command. The generic 1499 string-literal parser consumes the outermost single quotes 1500 and reduces any pair of single quotes to one data character. 1501 What the <type>bytea</type> input function sees is just one 1502 single quote, which it treats as a plain data character. 1503 However, the <type>bytea</type> input function treats 1504 backslashes as special, and the other behaviors shown in 1505 <xref linkend="datatype-binary-sqlesc"/> are implemented by 1506 that function. 1507 </para> 1508 1509 <para> 1510 In some contexts, backslashes must be doubled compared to what is 1511 shown above, because the generic string-literal parser will also 1512 reduce pairs of backslashes to one data character; 1513 see <xref linkend="sql-syntax-strings"/>. 1514 </para> 1515 1516 <para> 1517 <type>Bytea</type> octets are output in <literal>hex</literal> 1518 format by default. If you change <xref linkend="guc-bytea-output"/> 1519 to <literal>escape</literal>, 1520 <quote>non-printable</quote> octets are converted to their 1521 equivalent three-digit octal value and preceded by one backslash. 1522 Most <quote>printable</quote> octets are output by their standard 1523 representation in the client character set, e.g.: 1524 1525<programlisting> 1526SET bytea_output = 'escape'; 1527 1528SELECT 'abc \153\154\155 \052\251\124'::bytea; 1529 bytea 1530---------------- 1531 abc klm *\251T 1532</programlisting> 1533 1534 The octet with decimal value 92 (backslash) is doubled in the output. 1535 Details are in <xref linkend="datatype-binary-resesc"/>. 1536 </para> 1537 1538 <table id="datatype-binary-resesc"> 1539 <title><type>bytea</type> Output Escaped Octets</title> 1540 <tgroup cols="5"> 1541 <colspec colname="col1" colwidth="1*"/> 1542 <colspec colname="col2" colwidth="1*"/> 1543 <colspec colname="col3" colwidth="1*"/> 1544 <colspec colname="col4" colwidth="1.25*"/> 1545 <colspec colname="col5" colwidth="1*"/> 1546 <thead> 1547 <row> 1548 <entry>Decimal Octet Value</entry> 1549 <entry>Description</entry> 1550 <entry>Escaped Output Representation</entry> 1551 <entry>Example</entry> 1552 <entry>Output Result</entry> 1553 </row> 1554 </thead> 1555 1556 <tbody> 1557 1558 <row> 1559 <entry>92</entry> 1560 <entry>backslash</entry> 1561 <entry><literal>\\</literal></entry> 1562 <entry><literal>'\134'::bytea</literal></entry> 1563 <entry><literal>\\</literal></entry> 1564 </row> 1565 1566 <row> 1567 <entry>0 to 31 and 127 to 255</entry> 1568 <entry><quote>non-printable</quote> octets</entry> 1569 <entry><literal>\<replaceable>xxx</replaceable></literal> (octal value)</entry> 1570 <entry><literal>'\001'::bytea</literal></entry> 1571 <entry><literal>\001</literal></entry> 1572 </row> 1573 1574 <row> 1575 <entry>32 to 126</entry> 1576 <entry><quote>printable</quote> octets</entry> 1577 <entry>client character set representation</entry> 1578 <entry><literal>'\176'::bytea</literal></entry> 1579 <entry><literal>~</literal></entry> 1580 </row> 1581 1582 </tbody> 1583 </tgroup> 1584 </table> 1585 1586 <para> 1587 Depending on the front end to <productname>PostgreSQL</productname> you use, 1588 you might have additional work to do in terms of escaping and 1589 unescaping <type>bytea</type> strings. For example, you might also 1590 have to escape line feeds and carriage returns if your interface 1591 automatically translates these. 1592 </para> 1593 </sect2> 1594 </sect1> 1595 1596 1597 <sect1 id="datatype-datetime"> 1598 <title>Date/Time Types</title> 1599 1600 <indexterm zone="datatype-datetime"> 1601 <primary>date</primary> 1602 </indexterm> 1603 <indexterm zone="datatype-datetime"> 1604 <primary>time</primary> 1605 </indexterm> 1606 <indexterm zone="datatype-datetime"> 1607 <primary>time without time zone</primary> 1608 </indexterm> 1609 <indexterm zone="datatype-datetime"> 1610 <primary>time with time zone</primary> 1611 </indexterm> 1612 <indexterm zone="datatype-datetime"> 1613 <primary>timestamp</primary> 1614 </indexterm> 1615 <indexterm zone="datatype-datetime"> 1616 <primary>timestamptz</primary> 1617 </indexterm> 1618 <indexterm zone="datatype-datetime"> 1619 <primary>timestamp with time zone</primary> 1620 </indexterm> 1621 <indexterm zone="datatype-datetime"> 1622 <primary>timestamp without time zone</primary> 1623 </indexterm> 1624 <indexterm zone="datatype-datetime"> 1625 <primary>interval</primary> 1626 </indexterm> 1627 <indexterm zone="datatype-datetime"> 1628 <primary>time span</primary> 1629 </indexterm> 1630 1631 <para> 1632 <productname>PostgreSQL</productname> supports the full set of 1633 <acronym>SQL</acronym> date and time types, shown in <xref 1634 linkend="datatype-datetime-table"/>. The operations available 1635 on these data types are described in 1636 <xref linkend="functions-datetime"/>. 1637 Dates are counted according to the Gregorian calendar, even in 1638 years before that calendar was introduced (see <xref 1639 linkend="datetime-units-history"/> for more information). 1640 </para> 1641 1642 <table id="datatype-datetime-table"> 1643 <title>Date/Time Types</title> 1644 <tgroup cols="6"> 1645 <thead> 1646 <row> 1647 <entry>Name</entry> 1648 <entry>Storage Size</entry> 1649 <entry>Description</entry> 1650 <entry>Low Value</entry> 1651 <entry>High Value</entry> 1652 <entry>Resolution</entry> 1653 </row> 1654 </thead> 1655 <tbody> 1656 <row> 1657 <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry> 1658 <entry>8 bytes</entry> 1659 <entry>both date and time (no time zone)</entry> 1660 <entry>4713 BC</entry> 1661 <entry>294276 AD</entry> 1662 <entry>1 microsecond</entry> 1663 </row> 1664 <row> 1665 <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry> 1666 <entry>8 bytes</entry> 1667 <entry>both date and time, with time zone</entry> 1668 <entry>4713 BC</entry> 1669 <entry>294276 AD</entry> 1670 <entry>1 microsecond</entry> 1671 </row> 1672 <row> 1673 <entry><type>date</type></entry> 1674 <entry>4 bytes</entry> 1675 <entry>date (no time of day)</entry> 1676 <entry>4713 BC</entry> 1677 <entry>5874897 AD</entry> 1678 <entry>1 day</entry> 1679 </row> 1680 <row> 1681 <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry> 1682 <entry>8 bytes</entry> 1683 <entry>time of day (no date)</entry> 1684 <entry>00:00:00</entry> 1685 <entry>24:00:00</entry> 1686 <entry>1 microsecond</entry> 1687 </row> 1688 <row> 1689 <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry> 1690 <entry>12 bytes</entry> 1691 <entry>time of day (no date), with time zone</entry> 1692 <!-- see MAX_TZDISP_HOUR in datatype/timestamp.h --> 1693 <entry>00:00:00+1559</entry> 1694 <entry>24:00:00-1559</entry> 1695 <entry>1 microsecond</entry> 1696 </row> 1697 <row> 1698 <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry> 1699 <entry>16 bytes</entry> 1700 <entry>time interval</entry> 1701 <entry>-178000000 years</entry> 1702 <entry>178000000 years</entry> 1703 <entry>1 microsecond</entry> 1704 </row> 1705 </tbody> 1706 </tgroup> 1707 </table> 1708 1709 <note> 1710 <para> 1711 The SQL standard requires that writing just <type>timestamp</type> 1712 be equivalent to <type>timestamp without time 1713 zone</type>, and <productname>PostgreSQL</productname> honors that 1714 behavior. <type>timestamptz</type> is accepted as an 1715 abbreviation for <type>timestamp with time zone</type>; this is a 1716 <productname>PostgreSQL</productname> extension. 1717 </para> 1718 </note> 1719 1720 <para> 1721 <type>time</type>, <type>timestamp</type>, and 1722 <type>interval</type> accept an optional precision value 1723 <replaceable>p</replaceable> which specifies the number of 1724 fractional digits retained in the seconds field. By default, there 1725 is no explicit bound on precision. The allowed range of 1726 <replaceable>p</replaceable> is from 0 to 6. 1727 </para> 1728 1729 <para> 1730 The <type>interval</type> type has an additional option, which is 1731 to restrict the set of stored fields by writing one of these phrases: 1732<literallayout class="monospaced"> 1733YEAR 1734MONTH 1735DAY 1736HOUR 1737MINUTE 1738SECOND 1739YEAR TO MONTH 1740DAY TO HOUR 1741DAY TO MINUTE 1742DAY TO SECOND 1743HOUR TO MINUTE 1744HOUR TO SECOND 1745MINUTE TO SECOND 1746</literallayout> 1747 Note that if both <replaceable>fields</replaceable> and 1748 <replaceable>p</replaceable> are specified, the 1749 <replaceable>fields</replaceable> must include <literal>SECOND</literal>, 1750 since the precision applies only to the seconds. 1751 </para> 1752 1753 <para> 1754 The type <type>time with time zone</type> is defined by the SQL 1755 standard, but the definition exhibits properties which lead to 1756 questionable usefulness. In most cases, a combination of 1757 <type>date</type>, <type>time</type>, <type>timestamp without time 1758 zone</type>, and <type>timestamp with time zone</type> should 1759 provide a complete range of date/time functionality required by 1760 any application. 1761 </para> 1762 1763 <sect2 id="datatype-datetime-input"> 1764 <title>Date/Time Input</title> 1765 1766 <para> 1767 Date and time input is accepted in almost any reasonable format, including 1768 ISO 8601, <acronym>SQL</acronym>-compatible, 1769 traditional <productname>POSTGRES</productname>, and others. 1770 For some formats, ordering of day, month, and year in date input is 1771 ambiguous and there is support for specifying the expected 1772 ordering of these fields. Set the <xref linkend="guc-datestyle"/> parameter 1773 to <literal>MDY</literal> to select month-day-year interpretation, 1774 <literal>DMY</literal> to select day-month-year interpretation, or 1775 <literal>YMD</literal> to select year-month-day interpretation. 1776 </para> 1777 1778 <para> 1779 <productname>PostgreSQL</productname> is more flexible in 1780 handling date/time input than the 1781 <acronym>SQL</acronym> standard requires. 1782 See <xref linkend="datetime-appendix"/> 1783 for the exact parsing rules of date/time input and for the 1784 recognized text fields including months, days of the week, and 1785 time zones. 1786 </para> 1787 1788 <para> 1789 Remember that any date or time literal input needs to be enclosed 1790 in single quotes, like text strings. Refer to 1791 <xref linkend="sql-syntax-constants-generic"/> for more 1792 information. 1793 <acronym>SQL</acronym> requires the following syntax 1794<synopsis> 1795<replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>' 1796</synopsis> 1797 where <replaceable>p</replaceable> is an optional precision 1798 specification giving the number of 1799 fractional digits in the seconds field. Precision can be 1800 specified for <type>time</type>, <type>timestamp</type>, and 1801 <type>interval</type> types, and can range from 0 to 6. 1802 If no precision is specified in a constant specification, 1803 it defaults to the precision of the literal value (but not 1804 more than 6 digits). 1805 </para> 1806 1807 <sect3> 1808 <title>Dates</title> 1809 1810 <indexterm> 1811 <primary>date</primary> 1812 </indexterm> 1813 1814 <para> 1815 <xref linkend="datatype-datetime-date-table"/> shows some possible 1816 inputs for the <type>date</type> type. 1817 </para> 1818 1819 <table id="datatype-datetime-date-table"> 1820 <title>Date Input</title> 1821 <tgroup cols="2"> 1822 <colspec colname="col1" colwidth="1*"/> 1823 <colspec colname="col2" colwidth="2*"/> 1824 <thead> 1825 <row> 1826 <entry>Example</entry> 1827 <entry>Description</entry> 1828 </row> 1829 </thead> 1830 <tbody> 1831 <row> 1832 <entry>1999-01-08</entry> 1833 <entry>ISO 8601; January 8 in any mode 1834 (recommended format)</entry> 1835 </row> 1836 <row> 1837 <entry>January 8, 1999</entry> 1838 <entry>unambiguous in any <varname>datestyle</varname> input mode</entry> 1839 </row> 1840 <row> 1841 <entry>1/8/1999</entry> 1842 <entry>January 8 in <literal>MDY</literal> mode; 1843 August 1 in <literal>DMY</literal> mode</entry> 1844 </row> 1845 <row> 1846 <entry>1/18/1999</entry> 1847 <entry>January 18 in <literal>MDY</literal> mode; 1848 rejected in other modes</entry> 1849 </row> 1850 <row> 1851 <entry>01/02/03</entry> 1852 <entry>January 2, 2003 in <literal>MDY</literal> mode; 1853 February 1, 2003 in <literal>DMY</literal> mode; 1854 February 3, 2001 in <literal>YMD</literal> mode 1855 </entry> 1856 </row> 1857 <row> 1858 <entry>1999-Jan-08</entry> 1859 <entry>January 8 in any mode</entry> 1860 </row> 1861 <row> 1862 <entry>Jan-08-1999</entry> 1863 <entry>January 8 in any mode</entry> 1864 </row> 1865 <row> 1866 <entry>08-Jan-1999</entry> 1867 <entry>January 8 in any mode</entry> 1868 </row> 1869 <row> 1870 <entry>99-Jan-08</entry> 1871 <entry>January 8 in <literal>YMD</literal> mode, else error</entry> 1872 </row> 1873 <row> 1874 <entry>08-Jan-99</entry> 1875 <entry>January 8, except error in <literal>YMD</literal> mode</entry> 1876 </row> 1877 <row> 1878 <entry>Jan-08-99</entry> 1879 <entry>January 8, except error in <literal>YMD</literal> mode</entry> 1880 </row> 1881 <row> 1882 <entry>19990108</entry> 1883 <entry>ISO 8601; January 8, 1999 in any mode</entry> 1884 </row> 1885 <row> 1886 <entry>990108</entry> 1887 <entry>ISO 8601; January 8, 1999 in any mode</entry> 1888 </row> 1889 <row> 1890 <entry>1999.008</entry> 1891 <entry>year and day of year</entry> 1892 </row> 1893 <row> 1894 <entry>J2451187</entry> 1895 <entry>Julian date</entry> 1896 </row> 1897 <row> 1898 <entry>January 8, 99 BC</entry> 1899 <entry>year 99 BC</entry> 1900 </row> 1901 </tbody> 1902 </tgroup> 1903 </table> 1904 </sect3> 1905 1906 <sect3> 1907 <title>Times</title> 1908 1909 <indexterm> 1910 <primary>time</primary> 1911 </indexterm> 1912 <indexterm> 1913 <primary>time without time zone</primary> 1914 </indexterm> 1915 <indexterm> 1916 <primary>time with time zone</primary> 1917 </indexterm> 1918 1919 <para> 1920 The time-of-day types are <type>time [ 1921 (<replaceable>p</replaceable>) ] without time zone</type> and 1922 <type>time [ (<replaceable>p</replaceable>) ] with time 1923 zone</type>. <type>time</type> alone is equivalent to 1924 <type>time without time zone</type>. 1925 </para> 1926 1927 <para> 1928 Valid input for these types consists of a time of day followed 1929 by an optional time zone. (See <xref 1930 linkend="datatype-datetime-time-table"/> 1931 and <xref linkend="datatype-timezone-table"/>.) If a time zone is 1932 specified in the input for <type>time without time zone</type>, 1933 it is silently ignored. You can also specify a date but it will 1934 be ignored, except when you use a time zone name that involves a 1935 daylight-savings rule, such as 1936 <literal>America/New_York</literal>. In this case specifying the date 1937 is required in order to determine whether standard or daylight-savings 1938 time applies. The appropriate time zone offset is recorded in the 1939 <type>time with time zone</type> value. 1940 </para> 1941 1942 <table id="datatype-datetime-time-table"> 1943 <title>Time Input</title> 1944 <tgroup cols="2"> 1945 <colspec colname="col1" colwidth="3*"/> 1946 <colspec colname="col2" colwidth="2*"/> 1947 <thead> 1948 <row> 1949 <entry>Example</entry> 1950 <entry>Description</entry> 1951 </row> 1952 </thead> 1953 <tbody> 1954 <row> 1955 <entry><literal>04:05:06.789</literal></entry> 1956 <entry>ISO 8601</entry> 1957 </row> 1958 <row> 1959 <entry><literal>04:05:06</literal></entry> 1960 <entry>ISO 8601</entry> 1961 </row> 1962 <row> 1963 <entry><literal>04:05</literal></entry> 1964 <entry>ISO 8601</entry> 1965 </row> 1966 <row> 1967 <entry><literal>040506</literal></entry> 1968 <entry>ISO 8601</entry> 1969 </row> 1970 <row> 1971 <entry><literal>04:05 AM</literal></entry> 1972 <entry>same as 04:05; AM does not affect value</entry> 1973 </row> 1974 <row> 1975 <entry><literal>04:05 PM</literal></entry> 1976 <entry>same as 16:05; input hour must be <= 12</entry> 1977 </row> 1978 <row> 1979 <entry><literal>04:05:06.789-8</literal></entry> 1980 <entry>ISO 8601, with time zone as UTC offset</entry> 1981 </row> 1982 <row> 1983 <entry><literal>04:05:06-08:00</literal></entry> 1984 <entry>ISO 8601, with time zone as UTC offset</entry> 1985 </row> 1986 <row> 1987 <entry><literal>04:05-08:00</literal></entry> 1988 <entry>ISO 8601, with time zone as UTC offset</entry> 1989 </row> 1990 <row> 1991 <entry><literal>040506-08</literal></entry> 1992 <entry>ISO 8601, with time zone as UTC offset</entry> 1993 </row> 1994 <row> 1995 <entry><literal>040506+0730</literal></entry> 1996 <entry>ISO 8601, with fractional-hour time zone as UTC offset</entry> 1997 </row> 1998 <row> 1999 <entry><literal>040506+07:30:00</literal></entry> 2000 <entry>UTC offset specified to seconds (not allowed in ISO 8601)</entry> 2001 </row> 2002 <row> 2003 <entry><literal>04:05:06 PST</literal></entry> 2004 <entry>time zone specified by abbreviation</entry> 2005 </row> 2006 <row> 2007 <entry><literal>2003-04-12 04:05:06 America/New_York</literal></entry> 2008 <entry>time zone specified by full name</entry> 2009 </row> 2010 </tbody> 2011 </tgroup> 2012 </table> 2013 2014 <table tocentry="1" id="datatype-timezone-table"> 2015 <title>Time Zone Input</title> 2016 <tgroup cols="2"> 2017 <thead> 2018 <row> 2019 <entry>Example</entry> 2020 <entry>Description</entry> 2021 </row> 2022 </thead> 2023 <tbody> 2024 <row> 2025 <entry><literal>PST</literal></entry> 2026 <entry>Abbreviation (for Pacific Standard Time)</entry> 2027 </row> 2028 <row> 2029 <entry><literal>America/New_York</literal></entry> 2030 <entry>Full time zone name</entry> 2031 </row> 2032 <row> 2033 <entry><literal>PST8PDT</literal></entry> 2034 <entry>POSIX-style time zone specification</entry> 2035 </row> 2036 <row> 2037 <entry><literal>-8:00:00</literal></entry> 2038 <entry>UTC offset for PST</entry> 2039 </row> 2040 <row> 2041 <entry><literal>-8:00</literal></entry> 2042 <entry>UTC offset for PST (ISO 8601 extended format)</entry> 2043 </row> 2044 <row> 2045 <entry><literal>-800</literal></entry> 2046 <entry>UTC offset for PST (ISO 8601 basic format)</entry> 2047 </row> 2048 <row> 2049 <entry><literal>-8</literal></entry> 2050 <entry>UTC offset for PST (ISO 8601 basic format)</entry> 2051 </row> 2052 <row> 2053 <entry><literal>zulu</literal></entry> 2054 <entry>Military abbreviation for UTC</entry> 2055 </row> 2056 <row> 2057 <entry><literal>z</literal></entry> 2058 <entry>Short form of <literal>zulu</literal> (also in ISO 8601)</entry> 2059 </row> 2060 </tbody> 2061 </tgroup> 2062 </table> 2063 2064 <para> 2065 Refer to <xref linkend="datatype-timezones"/> for more information on how 2066 to specify time zones. 2067 </para> 2068 </sect3> 2069 2070 <sect3> 2071 <title>Time Stamps</title> 2072 2073 <indexterm> 2074 <primary>timestamp</primary> 2075 </indexterm> 2076 2077 <indexterm> 2078 <primary>timestamp with time zone</primary> 2079 </indexterm> 2080 2081 <indexterm> 2082 <primary>timestamp without time zone</primary> 2083 </indexterm> 2084 2085 <para> 2086 Valid input for the time stamp types consists of the concatenation 2087 of a date and a time, followed by an optional time zone, 2088 followed by an optional <literal>AD</literal> or <literal>BC</literal>. 2089 (Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear 2090 before the time zone, but this is not the preferred ordering.) 2091 Thus: 2092 2093<programlisting> 20941999-01-08 04:05:06 2095</programlisting> 2096 and: 2097<programlisting> 20981999-01-08 04:05:06 -8:00 2099</programlisting> 2100 2101 are valid values, which follow the <acronym>ISO</acronym> 8601 2102 standard. In addition, the common format: 2103<programlisting> 2104January 8 04:05:06 1999 PST 2105</programlisting> 2106 is supported. 2107 </para> 2108 2109 <para> 2110 The <acronym>SQL</acronym> standard differentiates 2111 <type>timestamp without time zone</type> 2112 and <type>timestamp with time zone</type> literals by the presence of a 2113 <quote>+</quote> or <quote>-</quote> symbol and time zone offset after 2114 the time. Hence, according to the standard, 2115 2116<programlisting> 2117TIMESTAMP '2004-10-19 10:23:54' 2118</programlisting> 2119 2120 is a <type>timestamp without time zone</type>, while 2121 2122<programlisting> 2123TIMESTAMP '2004-10-19 10:23:54+02' 2124</programlisting> 2125 2126 is a <type>timestamp with time zone</type>. 2127 <productname>PostgreSQL</productname> never examines the content of a 2128 literal string before determining its type, and therefore will treat 2129 both of the above as <type>timestamp without time zone</type>. To 2130 ensure that a literal is treated as <type>timestamp with time 2131 zone</type>, give it the correct explicit type: 2132 2133<programlisting> 2134TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02' 2135</programlisting> 2136 2137 In a literal that has been determined to be <type>timestamp without time 2138 zone</type>, <productname>PostgreSQL</productname> will silently ignore 2139 any time zone indication. 2140 That is, the resulting value is derived from the date/time 2141 fields in the input value, and is not adjusted for time zone. 2142 </para> 2143 2144 <para> 2145 For <type>timestamp with time zone</type>, the internally stored 2146 value is always in UTC (Universal 2147 Coordinated Time, traditionally known as Greenwich Mean Time, 2148 <acronym>GMT</acronym>). An input value that has an explicit 2149 time zone specified is converted to UTC using the appropriate offset 2150 for that time zone. If no time zone is stated in the input string, 2151 then it is assumed to be in the time zone indicated by the system's 2152 <xref linkend="guc-timezone"/> parameter, and is converted to UTC using the 2153 offset for the <varname>timezone</varname> zone. 2154 </para> 2155 2156 <para> 2157 When a <type>timestamp with time 2158 zone</type> value is output, it is always converted from UTC to the 2159 current <varname>timezone</varname> zone, and displayed as local time in that 2160 zone. To see the time in another time zone, either change 2161 <varname>timezone</varname> or use the <literal>AT TIME ZONE</literal> construct 2162 (see <xref linkend="functions-datetime-zoneconvert"/>). 2163 </para> 2164 2165 <para> 2166 Conversions between <type>timestamp without time zone</type> and 2167 <type>timestamp with time zone</type> normally assume that the 2168 <type>timestamp without time zone</type> value should be taken or given 2169 as <varname>timezone</varname> local time. A different time zone can 2170 be specified for the conversion using <literal>AT TIME ZONE</literal>. 2171 </para> 2172 </sect3> 2173 2174 <sect3 id="datatype-datetime-special-values"> 2175 <title>Special Values</title> 2176 2177 <indexterm> 2178 <primary>time</primary> 2179 <secondary>constants</secondary> 2180 </indexterm> 2181 2182 <indexterm> 2183 <primary>date</primary> 2184 <secondary>constants</secondary> 2185 </indexterm> 2186 2187 <para> 2188 <productname>PostgreSQL</productname> supports several 2189 special date/time input values for convenience, as shown in <xref 2190 linkend="datatype-datetime-special-table"/>. The values 2191 <literal>infinity</literal> and <literal>-infinity</literal> 2192 are specially represented inside the system and will be displayed 2193 unchanged; but the others are simply notational shorthands 2194 that will be converted to ordinary date/time values when read. 2195 (In particular, <literal>now</literal> and related strings are converted 2196 to a specific time value as soon as they are read.) 2197 All of these values need to be enclosed in single quotes when used 2198 as constants in SQL commands. 2199 </para> 2200 2201 <table id="datatype-datetime-special-table"> 2202 <title>Special Date/Time Inputs</title> 2203 <tgroup cols="3"> 2204 <thead> 2205 <row> 2206 <entry>Input String</entry> 2207 <entry>Valid Types</entry> 2208 <entry>Description</entry> 2209 </row> 2210 </thead> 2211 <tbody> 2212 <row> 2213 <entry><literal>epoch</literal></entry> 2214 <entry><type>date</type>, <type>timestamp</type></entry> 2215 <entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry> 2216 </row> 2217 <row> 2218 <entry><literal>infinity</literal></entry> 2219 <entry><type>date</type>, <type>timestamp</type></entry> 2220 <entry>later than all other time stamps</entry> 2221 </row> 2222 <row> 2223 <entry><literal>-infinity</literal></entry> 2224 <entry><type>date</type>, <type>timestamp</type></entry> 2225 <entry>earlier than all other time stamps</entry> 2226 </row> 2227 <row> 2228 <entry><literal>now</literal></entry> 2229 <entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry> 2230 <entry>current transaction's start time</entry> 2231 </row> 2232 <row> 2233 <entry><literal>today</literal></entry> 2234 <entry><type>date</type>, <type>timestamp</type></entry> 2235 <entry>midnight (<literal>00:00</literal>) today</entry> 2236 </row> 2237 <row> 2238 <entry><literal>tomorrow</literal></entry> 2239 <entry><type>date</type>, <type>timestamp</type></entry> 2240 <entry>midnight (<literal>00:00</literal>) tomorrow</entry> 2241 </row> 2242 <row> 2243 <entry><literal>yesterday</literal></entry> 2244 <entry><type>date</type>, <type>timestamp</type></entry> 2245 <entry>midnight (<literal>00:00</literal>) yesterday</entry> 2246 </row> 2247 <row> 2248 <entry><literal>allballs</literal></entry> 2249 <entry><type>time</type></entry> 2250 <entry>00:00:00.00 UTC</entry> 2251 </row> 2252 </tbody> 2253 </tgroup> 2254 </table> 2255 2256 <para> 2257 The following <acronym>SQL</acronym>-compatible functions can also 2258 be used to obtain the current time value for the corresponding data 2259 type: 2260 <literal>CURRENT_DATE</literal>, <literal>CURRENT_TIME</literal>, 2261 <literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>, 2262 <literal>LOCALTIMESTAMP</literal>. (See <xref 2263 linkend="functions-datetime-current"/>.) Note that these are 2264 SQL functions and are <emphasis>not</emphasis> recognized in data input strings. 2265 </para> 2266 2267 <caution> 2268 <para> 2269 While the input strings <literal>now</literal>, 2270 <literal>today</literal>, <literal>tomorrow</literal>, 2271 and <literal>yesterday</literal> are fine to use in interactive SQL 2272 commands, they can have surprising behavior when the command is 2273 saved to be executed later, for example in prepared statements, 2274 views, and function definitions. The string can be converted to a 2275 specific time value that continues to be used long after it becomes 2276 stale. Use one of the SQL functions instead in such contexts. 2277 For example, <literal>CURRENT_DATE + 1</literal> is safer than 2278 <literal>'tomorrow'::date</literal>. 2279 </para> 2280 </caution> 2281 2282 </sect3> 2283 </sect2> 2284 2285 <sect2 id="datatype-datetime-output"> 2286 <title>Date/Time Output</title> 2287 2288 <indexterm> 2289 <primary>date</primary> 2290 <secondary>output format</secondary> 2291 <seealso>formatting</seealso> 2292 </indexterm> 2293 2294 <indexterm> 2295 <primary>time</primary> 2296 <secondary>output format</secondary> 2297 <seealso>formatting</seealso> 2298 </indexterm> 2299 2300 <para> 2301 The output format of the date/time types can be set to one of the four 2302 styles ISO 8601, 2303 <acronym>SQL</acronym> (Ingres), traditional <productname>POSTGRES</productname> 2304 (Unix <application>date</application> format), or 2305 German. The default 2306 is the <acronym>ISO</acronym> format. (The 2307 <acronym>SQL</acronym> standard requires the use of the ISO 8601 2308 format. The name of the <quote>SQL</quote> output format is a 2309 historical accident.) <xref 2310 linkend="datatype-datetime-output-table"/> shows examples of each 2311 output style. The output of the <type>date</type> and 2312 <type>time</type> types is generally only the date or time part 2313 in accordance with the given examples. However, the 2314 <productname>POSTGRES</productname> style outputs date-only values in 2315 <acronym>ISO</acronym> format. 2316 </para> 2317 2318 <table id="datatype-datetime-output-table"> 2319 <title>Date/Time Output Styles</title> 2320 <tgroup cols="3"> 2321 <colspec colname="col1" colwidth="1*"/> 2322 <colspec colname="col2" colwidth="1*"/> 2323 <colspec colname="col3" colwidth="2*"/> 2324 <thead> 2325 <row> 2326 <entry>Style Specification</entry> 2327 <entry>Description</entry> 2328 <entry>Example</entry> 2329 </row> 2330 </thead> 2331 <tbody> 2332 <row> 2333 <entry><literal>ISO</literal></entry> 2334 <entry>ISO 8601, SQL standard</entry> 2335 <entry><literal>1997-12-17 07:37:16-08</literal></entry> 2336 </row> 2337 <row> 2338 <entry><literal>SQL</literal></entry> 2339 <entry>traditional style</entry> 2340 <entry><literal>12/17/1997 07:37:16.00 PST</literal></entry> 2341 </row> 2342 <row> 2343 <entry><literal>Postgres</literal></entry> 2344 <entry>original style</entry> 2345 <entry><literal>Wed Dec 17 07:37:16 1997 PST</literal></entry> 2346 </row> 2347 <row> 2348 <entry><literal>German</literal></entry> 2349 <entry>regional style</entry> 2350 <entry><literal>17.12.1997 07:37:16.00 PST</literal></entry> 2351 </row> 2352 </tbody> 2353 </tgroup> 2354 </table> 2355 2356 <note> 2357 <para> 2358 ISO 8601 specifies the use of uppercase letter <literal>T</literal> to separate 2359 the date and time. <productname>PostgreSQL</productname> accepts that format on 2360 input, but on output it uses a space rather than <literal>T</literal>, as shown 2361 above. This is for readability and for consistency with RFC 3339 as 2362 well as some other database systems. 2363 </para> 2364 </note> 2365 2366 <para> 2367 In the <acronym>SQL</acronym> and POSTGRES styles, day appears before 2368 month if DMY field ordering has been specified, otherwise month appears 2369 before day. 2370 (See <xref linkend="datatype-datetime-input"/> 2371 for how this setting also affects interpretation of input values.) 2372 <xref linkend="datatype-datetime-output2-table"/> shows examples. 2373 </para> 2374 2375 <table id="datatype-datetime-output2-table"> 2376 <title>Date Order Conventions</title> 2377 <tgroup cols="3"> 2378 <colspec colname="col1" colwidth="1*"/> 2379 <colspec colname="col2" colwidth="1*"/> 2380 <colspec colname="col3" colwidth="2*"/> 2381 <thead> 2382 <row> 2383 <entry><varname>datestyle</varname> Setting</entry> 2384 <entry>Input Ordering</entry> 2385 <entry>Example Output</entry> 2386 </row> 2387 </thead> 2388 <tbody> 2389 <row> 2390 <entry><literal>SQL, DMY</literal></entry> 2391 <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry> 2392 <entry><literal>17/12/1997 15:37:16.00 CET</literal></entry> 2393 </row> 2394 <row> 2395 <entry><literal>SQL, MDY</literal></entry> 2396 <entry><replaceable>month</replaceable>/<replaceable>day</replaceable>/<replaceable>year</replaceable></entry> 2397 <entry><literal>12/17/1997 07:37:16.00 PST</literal></entry> 2398 </row> 2399 <row> 2400 <entry><literal>Postgres, DMY</literal></entry> 2401 <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry> 2402 <entry><literal>Wed 17 Dec 07:37:16 1997 PST</literal></entry> 2403 </row> 2404 </tbody> 2405 </tgroup> 2406 </table> 2407 2408 <para> 2409 In the <acronym>ISO</acronym> style, the time zone is always shown as 2410 a signed numeric offset from UTC, with positive sign used for zones 2411 east of Greenwich. The offset will be shown 2412 as <replaceable>hh</replaceable> (hours only) if it is an integral 2413 number of hours, else 2414 as <replaceable>hh</replaceable>:<replaceable>mm</replaceable> if it 2415 is an integral number of minutes, else as 2416 <replaceable>hh</replaceable>:<replaceable>mm</replaceable>:<replaceable>ss</replaceable>. 2417 (The third case is not possible with any modern time zone standard, 2418 but it can appear when working with timestamps that predate the 2419 adoption of standardized time zones.) 2420 In the other date styles, the time zone is shown as an alphabetic 2421 abbreviation if one is in common use in the current zone. Otherwise 2422 it appears as a signed numeric offset in ISO 8601 basic format 2423 (<replaceable>hh</replaceable> or <replaceable>hhmm</replaceable>). 2424 </para> 2425 2426 <para> 2427 The date/time style can be selected by the user using the 2428 <command>SET datestyle</command> command, the <xref 2429 linkend="guc-datestyle"/> parameter in the 2430 <filename>postgresql.conf</filename> configuration file, or the 2431 <envar>PGDATESTYLE</envar> environment variable on the server or 2432 client. 2433 </para> 2434 2435 <para> 2436 The formatting function <function>to_char</function> 2437 (see <xref linkend="functions-formatting"/>) is also available as 2438 a more flexible way to format date/time output. 2439 </para> 2440 </sect2> 2441 2442 <sect2 id="datatype-timezones"> 2443 <title>Time Zones</title> 2444 2445 <indexterm zone="datatype-timezones"> 2446 <primary>time zone</primary> 2447 </indexterm> 2448 2449 <para> 2450 Time zones, and time-zone conventions, are influenced by 2451 political decisions, not just earth geometry. Time zones around the 2452 world became somewhat standardized during the 1900s, 2453 but continue to be prone to arbitrary changes, particularly with 2454 respect to daylight-savings rules. 2455 <productname>PostgreSQL</productname> uses the widely-used 2456 IANA (Olson) time zone database for information about 2457 historical time zone rules. For times in the future, the assumption 2458 is that the latest known rules for a given time zone will 2459 continue to be observed indefinitely far into the future. 2460 </para> 2461 2462 <para> 2463 <productname>PostgreSQL</productname> endeavors to be compatible with 2464 the <acronym>SQL</acronym> standard definitions for typical usage. 2465 However, the <acronym>SQL</acronym> standard has an odd mix of date and 2466 time types and capabilities. Two obvious problems are: 2467 2468 <itemizedlist> 2469 <listitem> 2470 <para> 2471 Although the <type>date</type> type 2472 cannot have an associated time zone, the 2473 <type>time</type> type can. 2474 Time zones in the real world have little meaning unless 2475 associated with a date as well as a time, 2476 since the offset can vary through the year with daylight-saving 2477 time boundaries. 2478 </para> 2479 </listitem> 2480 2481 <listitem> 2482 <para> 2483 The default time zone is specified as a constant numeric offset 2484 from <acronym>UTC</acronym>. It is therefore impossible to adapt to 2485 daylight-saving time when doing date/time arithmetic across 2486 <acronym>DST</acronym> boundaries. 2487 </para> 2488 </listitem> 2489 2490 </itemizedlist> 2491 </para> 2492 2493 <para> 2494 To address these difficulties, we recommend using date/time types 2495 that contain both date and time when using time zones. We 2496 do <emphasis>not</emphasis> recommend using the type <type>time with 2497 time zone</type> (though it is supported by 2498 <productname>PostgreSQL</productname> for legacy applications and 2499 for compliance with the <acronym>SQL</acronym> standard). 2500 <productname>PostgreSQL</productname> assumes 2501 your local time zone for any type containing only date or time. 2502 </para> 2503 2504 <para> 2505 All timezone-aware dates and times are stored internally in 2506 <acronym>UTC</acronym>. They are converted to local time 2507 in the zone specified by the <xref linkend="guc-timezone"/> configuration 2508 parameter before being displayed to the client. 2509 </para> 2510 2511 <para> 2512 <productname>PostgreSQL</productname> allows you to specify time zones in 2513 three different forms: 2514 <itemizedlist> 2515 <listitem> 2516 <para> 2517 A full time zone name, for example <literal>America/New_York</literal>. 2518 The recognized time zone names are listed in the 2519 <literal>pg_timezone_names</literal> view (see <xref 2520 linkend="view-pg-timezone-names"/>). 2521 <productname>PostgreSQL</productname> uses the widely-used IANA 2522 time zone data for this purpose, so the same time zone 2523 names are also recognized by other software. 2524 </para> 2525 </listitem> 2526 <listitem> 2527 <para> 2528 A time zone abbreviation, for example <literal>PST</literal>. Such a 2529 specification merely defines a particular offset from UTC, in 2530 contrast to full time zone names which can imply a set of daylight 2531 savings transition rules as well. The recognized abbreviations 2532 are listed in the <literal>pg_timezone_abbrevs</literal> view (see <xref 2533 linkend="view-pg-timezone-abbrevs"/>). You cannot set the 2534 configuration parameters <xref linkend="guc-timezone"/> or 2535 <xref linkend="guc-log-timezone"/> to a time 2536 zone abbreviation, but you can use abbreviations in 2537 date/time input values and with the <literal>AT TIME ZONE</literal> 2538 operator. 2539 </para> 2540 </listitem> 2541 <listitem> 2542 <para> 2543 In addition to the timezone names and abbreviations, 2544 <productname>PostgreSQL</productname> will accept POSIX-style time zone 2545 specifications, as described in 2546 <xref linkend="datetime-posix-timezone-specs"/>. This option is not 2547 normally preferable to using a named time zone, but it may be 2548 necessary if no suitable IANA time zone entry is available. 2549 </para> 2550 </listitem> 2551 </itemizedlist> 2552 2553 In short, this is the difference between abbreviations 2554 and full names: abbreviations represent a specific offset from UTC, 2555 whereas many of the full names imply a local daylight-savings time 2556 rule, and so have two possible UTC offsets. As an example, 2557 <literal>2014-06-04 12:00 America/New_York</literal> represents noon local 2558 time in New York, which for this particular date was Eastern Daylight 2559 Time (UTC-4). So <literal>2014-06-04 12:00 EDT</literal> specifies that 2560 same time instant. But <literal>2014-06-04 12:00 EST</literal> specifies 2561 noon Eastern Standard Time (UTC-5), regardless of whether daylight 2562 savings was nominally in effect on that date. 2563 </para> 2564 2565 <para> 2566 To complicate matters, some jurisdictions have used the same timezone 2567 abbreviation to mean different UTC offsets at different times; for 2568 example, in Moscow <literal>MSK</literal> has meant UTC+3 in some years and 2569 UTC+4 in others. <application>PostgreSQL</application> interprets such 2570 abbreviations according to whatever they meant (or had most recently 2571 meant) on the specified date; but, as with the <literal>EST</literal> example 2572 above, this is not necessarily the same as local civil time on that date. 2573 </para> 2574 2575 <para> 2576 In all cases, timezone names and abbreviations are recognized 2577 case-insensitively. (This is a change from <productname>PostgreSQL</productname> 2578 versions prior to 8.2, which were case-sensitive in some contexts but 2579 not others.) 2580 </para> 2581 2582 <para> 2583 Neither timezone names nor abbreviations are hard-wired into the server; 2584 they are obtained from configuration files stored under 2585 <filename>.../share/timezone/</filename> and <filename>.../share/timezonesets/</filename> 2586 of the installation directory 2587 (see <xref linkend="datetime-config-files"/>). 2588 </para> 2589 2590 <para> 2591 The <xref linkend="guc-timezone"/> configuration parameter can 2592 be set in the file <filename>postgresql.conf</filename>, or in any of the 2593 other standard ways described in <xref linkend="runtime-config"/>. 2594 There are also some special ways to set it: 2595 2596 <itemizedlist> 2597 <listitem> 2598 <para> 2599 The <acronym>SQL</acronym> command <command>SET TIME ZONE</command> 2600 sets the time zone for the session. This is an alternative spelling 2601 of <command>SET TIMEZONE TO</command> with a more SQL-spec-compatible syntax. 2602 </para> 2603 </listitem> 2604 2605 <listitem> 2606 <para> 2607 The <envar>PGTZ</envar> environment variable is used by 2608 <application>libpq</application> clients 2609 to send a <command>SET TIME ZONE</command> 2610 command to the server upon connection. 2611 </para> 2612 </listitem> 2613 </itemizedlist> 2614 </para> 2615 </sect2> 2616 2617 <sect2 id="datatype-interval-input"> 2618 <title>Interval Input</title> 2619 2620 <indexterm> 2621 <primary>interval</primary> 2622 </indexterm> 2623 2624 <para> 2625 <type>interval</type> values can be written using the following 2626 verbose syntax: 2627 2628<synopsis> 2629<optional>@</optional> <replaceable>quantity</replaceable> <replaceable>unit</replaceable> <optional><replaceable>quantity</replaceable> <replaceable>unit</replaceable>...</optional> <optional><replaceable>direction</replaceable></optional> 2630</synopsis> 2631 2632 where <replaceable>quantity</replaceable> is a number (possibly signed); 2633 <replaceable>unit</replaceable> is <literal>microsecond</literal>, 2634 <literal>millisecond</literal>, <literal>second</literal>, 2635 <literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>, 2636 <literal>week</literal>, <literal>month</literal>, <literal>year</literal>, 2637 <literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>, 2638 or abbreviations or plurals of these units; 2639 <replaceable>direction</replaceable> can be <literal>ago</literal> or 2640 empty. The at sign (<literal>@</literal>) is optional noise. The amounts 2641 of the different units are implicitly added with appropriate 2642 sign accounting. <literal>ago</literal> negates all the fields. 2643 This syntax is also used for interval output, if 2644 <xref linkend="guc-intervalstyle"/> is set to 2645 <literal>postgres_verbose</literal>. 2646 </para> 2647 2648 <para> 2649 Quantities of days, hours, minutes, and seconds can be specified without 2650 explicit unit markings. For example, <literal>'1 12:59:10'</literal> is read 2651 the same as <literal>'1 day 12 hours 59 min 10 sec'</literal>. Also, 2652 a combination of years and months can be specified with a dash; 2653 for example <literal>'200-10'</literal> is read the same as <literal>'200 years 2654 10 months'</literal>. (These shorter forms are in fact the only ones allowed 2655 by the <acronym>SQL</acronym> standard, and are used for output when 2656 <varname>IntervalStyle</varname> is set to <literal>sql_standard</literal>.) 2657 </para> 2658 2659 <para> 2660 Interval values can also be written as ISO 8601 time intervals, using 2661 either the <quote>format with designators</quote> of the standard's section 2662 4.4.3.2 or the <quote>alternative format</quote> of section 4.4.3.3. The 2663 format with designators looks like this: 2664<synopsis> 2665P <replaceable>quantity</replaceable> <replaceable>unit</replaceable> <optional> <replaceable>quantity</replaceable> <replaceable>unit</replaceable> ...</optional> <optional> T <optional> <replaceable>quantity</replaceable> <replaceable>unit</replaceable> ...</optional></optional> 2666</synopsis> 2667 The string must start with a <literal>P</literal>, and may include a 2668 <literal>T</literal> that introduces the time-of-day units. The 2669 available unit abbreviations are given in <xref 2670 linkend="datatype-interval-iso8601-units"/>. Units may be 2671 omitted, and may be specified in any order, but units smaller than 2672 a day must appear after <literal>T</literal>. In particular, the meaning of 2673 <literal>M</literal> depends on whether it is before or after 2674 <literal>T</literal>. 2675 </para> 2676 2677 <table id="datatype-interval-iso8601-units"> 2678 <title>ISO 8601 Interval Unit Abbreviations</title> 2679 <tgroup cols="2"> 2680 <thead> 2681 <row> 2682 <entry>Abbreviation</entry> 2683 <entry>Meaning</entry> 2684 </row> 2685 </thead> 2686 <tbody> 2687 <row> 2688 <entry>Y</entry> 2689 <entry>Years</entry> 2690 </row> 2691 <row> 2692 <entry>M</entry> 2693 <entry>Months (in the date part)</entry> 2694 </row> 2695 <row> 2696 <entry>W</entry> 2697 <entry>Weeks</entry> 2698 </row> 2699 <row> 2700 <entry>D</entry> 2701 <entry>Days</entry> 2702 </row> 2703 <row> 2704 <entry>H</entry> 2705 <entry>Hours</entry> 2706 </row> 2707 <row> 2708 <entry>M</entry> 2709 <entry>Minutes (in the time part)</entry> 2710 </row> 2711 <row> 2712 <entry>S</entry> 2713 <entry>Seconds</entry> 2714 </row> 2715 </tbody> 2716 </tgroup> 2717 </table> 2718 2719 <para> 2720 In the alternative format: 2721<synopsis> 2722P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-<replaceable>days</replaceable> </optional> <optional> T <replaceable>hours</replaceable>:<replaceable>minutes</replaceable>:<replaceable>seconds</replaceable> </optional> 2723</synopsis> 2724 the string must begin with <literal>P</literal>, and a 2725 <literal>T</literal> separates the date and time parts of the interval. 2726 The values are given as numbers similar to ISO 8601 dates. 2727 </para> 2728 2729 <para> 2730 When writing an interval constant with a <replaceable>fields</replaceable> 2731 specification, or when assigning a string to an interval column that was 2732 defined with a <replaceable>fields</replaceable> specification, the interpretation of 2733 unmarked quantities depends on the <replaceable>fields</replaceable>. For 2734 example <literal>INTERVAL '1' YEAR</literal> is read as 1 year, whereas 2735 <literal>INTERVAL '1'</literal> means 1 second. Also, field values 2736 <quote>to the right</quote> of the least significant field allowed by the 2737 <replaceable>fields</replaceable> specification are silently discarded. For 2738 example, writing <literal>INTERVAL '1 day 2:03:04' HOUR TO MINUTE</literal> 2739 results in dropping the seconds field, but not the day field. 2740 </para> 2741 2742 <para> 2743 According to the <acronym>SQL</acronym> standard all fields of an interval 2744 value must have the same sign, so a leading negative sign applies to all 2745 fields; for example the negative sign in the interval literal 2746 <literal>'-1 2:03:04'</literal> applies to both the days and hour/minute/second 2747 parts. <productname>PostgreSQL</productname> allows the fields to have different 2748 signs, and traditionally treats each field in the textual representation 2749 as independently signed, so that the hour/minute/second part is 2750 considered positive in this example. If <varname>IntervalStyle</varname> is 2751 set to <literal>sql_standard</literal> then a leading sign is considered 2752 to apply to all fields (but only if no additional signs appear). 2753 Otherwise the traditional <productname>PostgreSQL</productname> interpretation is 2754 used. To avoid ambiguity, it's recommended to attach an explicit sign 2755 to each field if any field is negative. 2756 </para> 2757 2758 <para> 2759 Field values can have fractional parts: for example, <literal>'1.5 2760 weeks'</literal> or <literal>'01:02:03.45'</literal>. However, 2761 because interval internally stores only three integer units (months, 2762 days, microseconds), fractional units must be spilled to smaller 2763 units. Fractional parts of units greater than months are truncated to 2764 be an integer number of months, e.g. <literal>'1.5 years'</literal> 2765 becomes <literal>'1 year 6 mons'</literal>. Fractional parts of 2766 weeks and days are computed to be an integer number of days and 2767 microseconds, assuming 30 days per month and 24 hours per day, e.g., 2768 <literal>'1.75 months'</literal> becomes <literal>1 mon 22 days 2769 12:00:00</literal>. Only seconds will ever be shown as fractional 2770 on output. 2771 </para> 2772 2773 <para> 2774 <xref linkend="datatype-interval-input-examples"/> shows some examples 2775 of valid <type>interval</type> input. 2776 </para> 2777 2778 <table id="datatype-interval-input-examples"> 2779 <title>Interval Input</title> 2780 <tgroup cols="2"> 2781 <thead> 2782 <row> 2783 <entry>Example</entry> 2784 <entry>Description</entry> 2785 </row> 2786 </thead> 2787 <tbody> 2788 <row> 2789 <entry><literal>1-2</literal></entry> 2790 <entry>SQL standard format: 1 year 2 months</entry> 2791 </row> 2792 <row> 2793 <entry><literal>3 4:05:06</literal></entry> 2794 <entry>SQL standard format: 3 days 4 hours 5 minutes 6 seconds</entry> 2795 </row> 2796 <row> 2797 <entry><literal>1 year 2 months 3 days 4 hours 5 minutes 6 seconds</literal></entry> 2798 <entry>Traditional Postgres format: 1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry> 2799 </row> 2800 <row> 2801 <entry><literal>P1Y2M3DT4H5M6S</literal></entry> 2802 <entry>ISO 8601 <quote>format with designators</quote>: same meaning as above</entry> 2803 </row> 2804 <row> 2805 <entry><literal>P0001-02-03T04:05:06</literal></entry> 2806 <entry>ISO 8601 <quote>alternative format</quote>: same meaning as above</entry> 2807 </row> 2808 </tbody> 2809 </tgroup> 2810 </table> 2811 2812 <para> 2813 Internally <type>interval</type> values are stored as months, days, 2814 and microseconds. This is done because the number of days in a month 2815 varies, and a day can have 23 or 25 hours if a daylight savings 2816 time adjustment is involved. The months and days fields are integers 2817 while the microseconds field can store fractional seconds. Because intervals are 2818 usually created from constant strings or <type>timestamp</type> subtraction, 2819 this storage method works well in most cases, but can cause unexpected 2820 results: 2821 2822<programlisting> 2823SELECT EXTRACT(hours from '80 minutes'::interval); 2824 date_part 2825----------- 2826 1 2827 2828SELECT EXTRACT(days from '80 hours'::interval); 2829 date_part 2830----------- 2831 0 2832</programlisting> 2833 2834 Functions <function>justify_days</function> and 2835 <function>justify_hours</function> are available for adjusting days 2836 and hours that overflow their normal ranges. 2837 </para> 2838 2839 </sect2> 2840 2841 <sect2 id="datatype-interval-output"> 2842 <title>Interval Output</title> 2843 2844 <indexterm> 2845 <primary>interval</primary> 2846 <secondary>output format</secondary> 2847 <seealso>formatting</seealso> 2848 </indexterm> 2849 2850 <para> 2851 The output format of the interval type can be set to one of the 2852 four styles <literal>sql_standard</literal>, <literal>postgres</literal>, 2853 <literal>postgres_verbose</literal>, or <literal>iso_8601</literal>, 2854 using the command <literal>SET intervalstyle</literal>. 2855 The default is the <literal>postgres</literal> format. 2856 <xref linkend="interval-style-output-table"/> shows examples of each 2857 output style. 2858 </para> 2859 2860 <para> 2861 The <literal>sql_standard</literal> style produces output that conforms to 2862 the SQL standard's specification for interval literal strings, if 2863 the interval value meets the standard's restrictions (either year-month 2864 only or day-time only, with no mixing of positive 2865 and negative components). Otherwise the output looks like a standard 2866 year-month literal string followed by a day-time literal string, 2867 with explicit signs added to disambiguate mixed-sign intervals. 2868 </para> 2869 2870 <para> 2871 The output of the <literal>postgres</literal> style matches the output of 2872 <productname>PostgreSQL</productname> releases prior to 8.4 when the 2873 <xref linkend="guc-datestyle"/> parameter was set to <literal>ISO</literal>. 2874 </para> 2875 2876 <para> 2877 The output of the <literal>postgres_verbose</literal> style matches the output of 2878 <productname>PostgreSQL</productname> releases prior to 8.4 when the 2879 <varname>DateStyle</varname> parameter was set to non-<literal>ISO</literal> output. 2880 </para> 2881 2882 <para> 2883 The output of the <literal>iso_8601</literal> style matches the <quote>format 2884 with designators</quote> described in section 4.4.3.2 of the 2885 ISO 8601 standard. 2886 </para> 2887 2888 <table id="interval-style-output-table"> 2889 <title>Interval Output Style Examples</title> 2890 <tgroup cols="4"> 2891 <thead> 2892 <row> 2893 <entry>Style Specification</entry> 2894 <entry>Year-Month Interval</entry> 2895 <entry>Day-Time Interval</entry> 2896 <entry>Mixed Interval</entry> 2897 </row> 2898 </thead> 2899 <tbody> 2900 <row> 2901 <entry><literal>sql_standard</literal></entry> 2902 <entry>1-2</entry> 2903 <entry>3 4:05:06</entry> 2904 <entry>-1-2 +3 -4:05:06</entry> 2905 </row> 2906 <row> 2907 <entry><literal>postgres</literal></entry> 2908 <entry>1 year 2 mons</entry> 2909 <entry>3 days 04:05:06</entry> 2910 <entry>-1 year -2 mons +3 days -04:05:06</entry> 2911 </row> 2912 <row> 2913 <entry><literal>postgres_verbose</literal></entry> 2914 <entry>@ 1 year 2 mons</entry> 2915 <entry>@ 3 days 4 hours 5 mins 6 secs</entry> 2916 <entry>@ 1 year 2 mons -3 days 4 hours 5 mins 6 secs ago</entry> 2917 </row> 2918 <row> 2919 <entry><literal>iso_8601</literal></entry> 2920 <entry>P1Y2M</entry> 2921 <entry>P3DT4H5M6S</entry> 2922 <entry>P-1Y-2M3D&zwsp;T-4H-5M-6S</entry> 2923 </row> 2924 </tbody> 2925 </tgroup> 2926 </table> 2927 2928 </sect2> 2929 2930 </sect1> 2931 2932 <sect1 id="datatype-boolean"> 2933 <title>Boolean Type</title> 2934 2935 <indexterm zone="datatype-boolean"> 2936 <primary>Boolean</primary> 2937 <secondary>data type</secondary> 2938 </indexterm> 2939 2940 <indexterm zone="datatype-boolean"> 2941 <primary>true</primary> 2942 </indexterm> 2943 2944 <indexterm zone="datatype-boolean"> 2945 <primary>false</primary> 2946 </indexterm> 2947 2948 <para> 2949 <productname>PostgreSQL</productname> provides the 2950 standard <acronym>SQL</acronym> type <type>boolean</type>; 2951 see <xref linkend="datatype-boolean-table"/>. 2952 The <type>boolean</type> type can have several states: 2953 <quote>true</quote>, <quote>false</quote>, and a third state, 2954 <quote>unknown</quote>, which is represented by the 2955 <acronym>SQL</acronym> null value. 2956 </para> 2957 2958 <table id="datatype-boolean-table"> 2959 <title>Boolean Data Type</title> 2960 <tgroup cols="3"> 2961 <thead> 2962 <row> 2963 <entry>Name</entry> 2964 <entry>Storage Size</entry> 2965 <entry>Description</entry> 2966 </row> 2967 </thead> 2968 <tbody> 2969 <row> 2970 <entry><type>boolean</type></entry> 2971 <entry>1 byte</entry> 2972 <entry>state of true or false</entry> 2973 </row> 2974 </tbody> 2975 </tgroup> 2976 </table> 2977 2978 <para> 2979 Boolean constants can be represented in SQL queries by the SQL 2980 key words <literal>TRUE</literal>, <literal>FALSE</literal>, 2981 and <literal>NULL</literal>. 2982 </para> 2983 2984 <para> 2985 The datatype input function for type <type>boolean</type> accepts these 2986 string representations for the <quote>true</quote> state: 2987 <simplelist> 2988 <member><literal>true</literal></member> 2989 <member><literal>yes</literal></member> 2990 <member><literal>on</literal></member> 2991 <member><literal>1</literal></member> 2992 </simplelist> 2993 and these representations for the <quote>false</quote> state: 2994 <simplelist> 2995 <member><literal>false</literal></member> 2996 <member><literal>no</literal></member> 2997 <member><literal>off</literal></member> 2998 <member><literal>0</literal></member> 2999 </simplelist> 3000 Unique prefixes of these strings are also accepted, for 3001 example <literal>t</literal> or <literal>n</literal>. 3002 Leading or trailing whitespace is ignored, and case does not matter. 3003 </para> 3004 3005 <para> 3006 The datatype output function for type <type>boolean</type> always emits 3007 either <literal>t</literal> or <literal>f</literal>, as shown in 3008 <xref linkend="datatype-boolean-example"/>. 3009 </para> 3010 3011 <example id="datatype-boolean-example"> 3012 <title>Using the <type>boolean</type> Type</title> 3013 3014<programlisting> 3015CREATE TABLE test1 (a boolean, b text); 3016INSERT INTO test1 VALUES (TRUE, 'sic est'); 3017INSERT INTO test1 VALUES (FALSE, 'non est'); 3018SELECT * FROM test1; 3019 a | b 3020---+--------- 3021 t | sic est 3022 f | non est 3023 3024SELECT * FROM test1 WHERE a; 3025 a | b 3026---+--------- 3027 t | sic est 3028</programlisting> 3029 </example> 3030 3031 <para> 3032 The key words <literal>TRUE</literal> and <literal>FALSE</literal> are 3033 the preferred (<acronym>SQL</acronym>-compliant) method for writing 3034 Boolean constants in SQL queries. But you can also use the string 3035 representations by following the generic string-literal constant syntax 3036 described in <xref linkend="sql-syntax-constants-generic"/>, for 3037 example <literal>'yes'::boolean</literal>. 3038 </para> 3039 3040 <para> 3041 Note that the parser automatically understands 3042 that <literal>TRUE</literal> and <literal>FALSE</literal> are of 3043 type <type>boolean</type>, but this is not so 3044 for <literal>NULL</literal> because that can have any type. 3045 So in some contexts you might have to cast <literal>NULL</literal> 3046 to <type>boolean</type> explicitly, for 3047 example <literal>NULL::boolean</literal>. Conversely, the cast can be 3048 omitted from a string-literal Boolean value in contexts where the parser 3049 can deduce that the literal must be of type <type>boolean</type>. 3050 </para> 3051 </sect1> 3052 3053 <sect1 id="datatype-enum"> 3054 <title>Enumerated Types</title> 3055 3056 <indexterm zone="datatype-enum"> 3057 <primary>data type</primary> 3058 <secondary>enumerated (enum)</secondary> 3059 </indexterm> 3060 3061 <indexterm zone="datatype-enum"> 3062 <primary>enumerated types</primary> 3063 </indexterm> 3064 3065 <para> 3066 Enumerated (enum) types are data types that 3067 comprise a static, ordered set of values. 3068 They are equivalent to the <type>enum</type> 3069 types supported in a number of programming languages. An example of an enum 3070 type might be the days of the week, or a set of status values for 3071 a piece of data. 3072 </para> 3073 3074 <sect2> 3075 <title>Declaration of Enumerated Types</title> 3076 3077 <para> 3078 Enum types are created using the <xref 3079 linkend="sql-createtype"/> command, 3080 for example: 3081 3082<programlisting> 3083CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy'); 3084</programlisting> 3085 3086 Once created, the enum type can be used in table and function 3087 definitions much like any other type: 3088<programlisting> 3089CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy'); 3090CREATE TABLE person ( 3091 name text, 3092 current_mood mood 3093); 3094INSERT INTO person VALUES ('Moe', 'happy'); 3095SELECT * FROM person WHERE current_mood = 'happy'; 3096 name | current_mood 3097------+-------------- 3098 Moe | happy 3099(1 row) 3100</programlisting> 3101 </para> 3102 </sect2> 3103 3104 <sect2> 3105 <title>Ordering</title> 3106 3107 <para> 3108 The ordering of the values in an enum type is the 3109 order in which the values were listed when the type was created. 3110 All standard comparison operators and related 3111 aggregate functions are supported for enums. For example: 3112 3113<programlisting> 3114INSERT INTO person VALUES ('Larry', 'sad'); 3115INSERT INTO person VALUES ('Curly', 'ok'); 3116SELECT * FROM person WHERE current_mood > 'sad'; 3117 name | current_mood 3118-------+-------------- 3119 Moe | happy 3120 Curly | ok 3121(2 rows) 3122 3123SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood; 3124 name | current_mood 3125-------+-------------- 3126 Curly | ok 3127 Moe | happy 3128(2 rows) 3129 3130SELECT name 3131FROM person 3132WHERE current_mood = (SELECT MIN(current_mood) FROM person); 3133 name 3134------- 3135 Larry 3136(1 row) 3137</programlisting> 3138 </para> 3139 </sect2> 3140 3141 <sect2> 3142 <title>Type Safety</title> 3143 3144 <para> 3145 Each enumerated data type is separate and cannot 3146 be compared with other enumerated types. See this example: 3147 3148<programlisting> 3149CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic'); 3150CREATE TABLE holidays ( 3151 num_weeks integer, 3152 happiness happiness 3153); 3154INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy'); 3155INSERT INTO holidays(num_weeks,happiness) VALUES (6, 'very happy'); 3156INSERT INTO holidays(num_weeks,happiness) VALUES (8, 'ecstatic'); 3157INSERT INTO holidays(num_weeks,happiness) VALUES (2, 'sad'); 3158ERROR: invalid input value for enum happiness: "sad" 3159SELECT person.name, holidays.num_weeks FROM person, holidays 3160 WHERE person.current_mood = holidays.happiness; 3161ERROR: operator does not exist: mood = happiness 3162</programlisting> 3163 </para> 3164 3165 <para> 3166 If you really need to do something like that, you can either 3167 write a custom operator or add explicit casts to your query: 3168 3169<programlisting> 3170SELECT person.name, holidays.num_weeks FROM person, holidays 3171 WHERE person.current_mood::text = holidays.happiness::text; 3172 name | num_weeks 3173------+----------- 3174 Moe | 4 3175(1 row) 3176 3177</programlisting> 3178 </para> 3179 </sect2> 3180 3181 <sect2> 3182 <title>Implementation Details</title> 3183 3184 <para> 3185 Enum labels are case sensitive, so 3186 <type>'happy'</type> is not the same as <type>'HAPPY'</type>. 3187 White space in the labels is significant too. 3188 </para> 3189 3190 <para> 3191 Although enum types are primarily intended for static sets of values, 3192 there is support for adding new values to an existing enum type, and for 3193 renaming values (see <xref linkend="sql-altertype"/>). Existing values 3194 cannot be removed from an enum type, nor can the sort ordering of such 3195 values be changed, short of dropping and re-creating the enum type. 3196 </para> 3197 3198 <para> 3199 An enum value occupies four bytes on disk. The length of an enum 3200 value's textual label is limited by the <symbol>NAMEDATALEN</symbol> 3201 setting compiled into <productname>PostgreSQL</productname>; in standard 3202 builds this means at most 63 bytes. 3203 </para> 3204 3205 <para> 3206 The translations from internal enum values to textual labels are 3207 kept in the system catalog 3208 <link linkend="catalog-pg-enum"><structname>pg_enum</structname></link>. 3209 Querying this catalog directly can be useful. 3210 </para> 3211 3212 </sect2> 3213 </sect1> 3214 3215 <sect1 id="datatype-geometric"> 3216 <title>Geometric Types</title> 3217 3218 <para> 3219 Geometric data types represent two-dimensional spatial 3220 objects. <xref linkend="datatype-geo-table"/> shows the geometric 3221 types available in <productname>PostgreSQL</productname>. 3222 </para> 3223 3224 <table id="datatype-geo-table"> 3225 <title>Geometric Types</title> 3226 <tgroup cols="4"> 3227 <colspec colname="col1" colwidth="1*"/> 3228 <colspec colname="col2" colwidth="1*"/> 3229 <colspec colname="col3" colwidth="2*"/> 3230 <colspec colname="col4" colwidth="1*"/> 3231 <thead> 3232 <row> 3233 <entry>Name</entry> 3234 <entry>Storage Size</entry> 3235 <entry>Description</entry> 3236 <entry>Representation</entry> 3237 </row> 3238 </thead> 3239 <tbody> 3240 <row> 3241 <entry><type>point</type></entry> 3242 <entry>16 bytes</entry> 3243 <entry>Point on a plane</entry> 3244 <entry>(x,y)</entry> 3245 </row> 3246 <row> 3247 <entry><type>line</type></entry> 3248 <entry>32 bytes</entry> 3249 <entry>Infinite line</entry> 3250 <entry>{A,B,C}</entry> 3251 </row> 3252 <row> 3253 <entry><type>lseg</type></entry> 3254 <entry>32 bytes</entry> 3255 <entry>Finite line segment</entry> 3256 <entry>((x1,y1),(x2,y2))</entry> 3257 </row> 3258 <row> 3259 <entry><type>box</type></entry> 3260 <entry>32 bytes</entry> 3261 <entry>Rectangular box</entry> 3262 <entry>((x1,y1),(x2,y2))</entry> 3263 </row> 3264 <row> 3265 <entry><type>path</type></entry> 3266 <entry>16+16n bytes</entry> 3267 <entry>Closed path (similar to polygon)</entry> 3268 <entry>((x1,y1),...)</entry> 3269 </row> 3270 <row> 3271 <entry><type>path</type></entry> 3272 <entry>16+16n bytes</entry> 3273 <entry>Open path</entry> 3274 <entry>[(x1,y1),...]</entry> 3275 </row> 3276 <row> 3277 <entry><type>polygon</type></entry> 3278 <entry>40+16n bytes</entry> 3279 <entry>Polygon (similar to closed path)</entry> 3280 <entry>((x1,y1),...)</entry> 3281 </row> 3282 <row> 3283 <entry><type>circle</type></entry> 3284 <entry>24 bytes</entry> 3285 <entry>Circle</entry> 3286 <entry><(x,y),r> (center point and radius)</entry> 3287 </row> 3288 </tbody> 3289 </tgroup> 3290 </table> 3291 3292 <para> 3293 A rich set of functions and operators is available to perform various geometric 3294 operations such as scaling, translation, rotation, and determining 3295 intersections. They are explained in <xref linkend="functions-geometry"/>. 3296 </para> 3297 3298 <sect2> 3299 <title>Points</title> 3300 3301 <indexterm> 3302 <primary>point</primary> 3303 </indexterm> 3304 3305 <para> 3306 Points are the fundamental two-dimensional building block for geometric 3307 types. Values of type <type>point</type> are specified using either of 3308 the following syntaxes: 3309 3310<synopsis> 3311( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) 3312 <replaceable>x</replaceable> , <replaceable>y</replaceable> 3313</synopsis> 3314 3315 where <replaceable>x</replaceable> and <replaceable>y</replaceable> are the respective 3316 coordinates, as floating-point numbers. 3317 </para> 3318 3319 <para> 3320 Points are output using the first syntax. 3321 </para> 3322 </sect2> 3323 3324 <sect2 id="datatype-line"> 3325 <title>Lines</title> 3326 3327 <indexterm> 3328 <primary>line</primary> 3329 </indexterm> 3330 3331 <para> 3332 Lines are represented by the linear 3333 equation <replaceable>A</replaceable>x + <replaceable>B</replaceable>y + <replaceable>C</replaceable> = 0, 3334 where <replaceable>A</replaceable> and <replaceable>B</replaceable> are not both zero. Values 3335 of type <type>line</type> are input and output in the following form: 3336<synopsis> 3337{ <replaceable>A</replaceable>, <replaceable>B</replaceable>, <replaceable>C</replaceable> } 3338</synopsis> 3339 3340 Alternatively, any of the following forms can be used for input: 3341 3342<synopsis> 3343[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ] 3344( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ) 3345 ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) 3346 <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable> 3347</synopsis> 3348 3349 where 3350 <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal> 3351 and 3352 <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal> 3353 are two different points on the line. 3354 </para> 3355 </sect2> 3356 3357 <sect2 id="datatype-lseg"> 3358 <title>Line Segments</title> 3359 3360 <indexterm> 3361 <primary>lseg</primary> 3362 </indexterm> 3363 3364 <indexterm> 3365 <primary>line segment</primary> 3366 </indexterm> 3367 3368 <para> 3369 Line segments are represented by pairs of points that are the endpoints 3370 of the segment. Values of type <type>lseg</type> are specified using any 3371 of the following syntaxes: 3372 3373<synopsis> 3374[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ] 3375( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ) 3376 ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) 3377 <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable> 3378</synopsis> 3379 3380 where 3381 <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal> 3382 and 3383 <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal> 3384 are the end points of the line segment. 3385 </para> 3386 3387 <para> 3388 Line segments are output using the first syntax. 3389 </para> 3390 </sect2> 3391 3392 <sect2> 3393 <title>Boxes</title> 3394 3395 <indexterm> 3396 <primary>box (data type)</primary> 3397 </indexterm> 3398 3399 <indexterm> 3400 <primary>rectangle</primary> 3401 </indexterm> 3402 3403 <para> 3404 Boxes are represented by pairs of points that are opposite 3405 corners of the box. 3406 Values of type <type>box</type> are specified using any of the following 3407 syntaxes: 3408 3409<synopsis> 3410( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ) 3411 ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) 3412 <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable> 3413</synopsis> 3414 3415 where 3416 <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal> 3417 and 3418 <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal> 3419 are any two opposite corners of the box. 3420 </para> 3421 3422 <para> 3423 Boxes are output using the second syntax. 3424 </para> 3425 3426 <para> 3427 Any two opposite corners can be supplied on input, but the values 3428 will be reordered as needed to store the 3429 upper right and lower left corners, in that order. 3430 </para> 3431 </sect2> 3432 3433 <sect2> 3434 <title>Paths</title> 3435 3436 <indexterm> 3437 <primary>path (data type)</primary> 3438 </indexterm> 3439 3440 <para> 3441 Paths are represented by lists of connected points. Paths can be 3442 <firstterm>open</firstterm>, where 3443 the first and last points in the list are considered not connected, or 3444 <firstterm>closed</firstterm>, 3445 where the first and last points are considered connected. 3446 </para> 3447 3448 <para> 3449 Values of type <type>path</type> are specified using any of the following 3450 syntaxes: 3451 3452<synopsis> 3453[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ] 3454( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ) 3455 ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) 3456 ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) 3457 <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> 3458</synopsis> 3459 3460 where the points are the end points of the line segments 3461 comprising the path. Square brackets (<literal>[]</literal>) indicate 3462 an open path, while parentheses (<literal>()</literal>) indicate a 3463 closed path. When the outermost parentheses are omitted, as 3464 in the third through fifth syntaxes, a closed path is assumed. 3465 </para> 3466 3467 <para> 3468 Paths are output using the first or second syntax, as appropriate. 3469 </para> 3470 </sect2> 3471 3472 <sect2 id="datatype-polygon"> 3473 <title>Polygons</title> 3474 3475 <indexterm> 3476 <primary>polygon</primary> 3477 </indexterm> 3478 3479 <para> 3480 Polygons are represented by lists of points (the vertexes of the 3481 polygon). Polygons are very similar to closed paths, but are 3482 stored differently and have their own set of support routines. 3483 </para> 3484 3485 <para> 3486 Values of type <type>polygon</type> are specified using any of the 3487 following syntaxes: 3488 3489<synopsis> 3490( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ) 3491 ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) 3492 ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) 3493 <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> 3494</synopsis> 3495 3496 where the points are the end points of the line segments 3497 comprising the boundary of the polygon. 3498 </para> 3499 3500 <para> 3501 Polygons are output using the first syntax. 3502 </para> 3503 </sect2> 3504 3505 <sect2 id="datatype-circle"> 3506 <title>Circles</title> 3507 3508 <indexterm> 3509 <primary>circle</primary> 3510 </indexterm> 3511 3512 <para> 3513 Circles are represented by a center point and radius. 3514 Values of type <type>circle</type> are specified using any of the 3515 following syntaxes: 3516 3517<synopsis> 3518< ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> > 3519( ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> ) 3520 ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> 3521 <replaceable>x</replaceable> , <replaceable>y</replaceable> , <replaceable>r</replaceable> 3522</synopsis> 3523 3524 where 3525 <literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal> 3526 is the center point and <replaceable>r</replaceable> is the radius of the 3527 circle. 3528 </para> 3529 3530 <para> 3531 Circles are output using the first syntax. 3532 </para> 3533 </sect2> 3534 3535 </sect1> 3536 3537 <sect1 id="datatype-net-types"> 3538 <title>Network Address Types</title> 3539 3540 <indexterm zone="datatype-net-types"> 3541 <primary>network</primary> 3542 <secondary>data types</secondary> 3543 </indexterm> 3544 3545 <para> 3546 <productname>PostgreSQL</productname> offers data types to store IPv4, IPv6, and MAC 3547 addresses, as shown in <xref linkend="datatype-net-types-table"/>. It 3548 is better to use these types instead of plain text types to store 3549 network addresses, because 3550 these types offer input error checking and specialized 3551 operators and functions (see <xref linkend="functions-net"/>). 3552 </para> 3553 3554 <table tocentry="1" id="datatype-net-types-table"> 3555 <title>Network Address Types</title> 3556 <tgroup cols="3"> 3557 <colspec colname="col1" colwidth="1*"/> 3558 <colspec colname="col2" colwidth="1*"/> 3559 <colspec colname="col3" colwidth="2*"/> 3560 <thead> 3561 <row> 3562 <entry>Name</entry> 3563 <entry>Storage Size</entry> 3564 <entry>Description</entry> 3565 </row> 3566 </thead> 3567 <tbody> 3568 3569 <row> 3570 <entry><type>cidr</type></entry> 3571 <entry>7 or 19 bytes</entry> 3572 <entry>IPv4 and IPv6 networks</entry> 3573 </row> 3574 3575 <row> 3576 <entry><type>inet</type></entry> 3577 <entry>7 or 19 bytes</entry> 3578 <entry>IPv4 and IPv6 hosts and networks</entry> 3579 </row> 3580 3581 <row> 3582 <entry><type>macaddr</type></entry> 3583 <entry>6 bytes</entry> 3584 <entry>MAC addresses</entry> 3585 </row> 3586 3587 <row> 3588 <entry><type>macaddr8</type></entry> 3589 <entry>8 bytes</entry> 3590 <entry>MAC addresses (EUI-64 format)</entry> 3591 </row> 3592 3593 </tbody> 3594 </tgroup> 3595 </table> 3596 3597 <para> 3598 When sorting <type>inet</type> or <type>cidr</type> data types, 3599 IPv4 addresses will always sort before IPv6 addresses, including 3600 IPv4 addresses encapsulated or mapped to IPv6 addresses, such as 3601 ::10.2.3.4 or ::ffff:10.4.3.2. 3602 </para> 3603 3604 3605 <sect2 id="datatype-inet"> 3606 <title><type>inet</type></title> 3607 3608 <indexterm> 3609 <primary>inet (data type)</primary> 3610 </indexterm> 3611 3612 <para> 3613 The <type>inet</type> type holds an IPv4 or IPv6 host address, and 3614 optionally its subnet, all in one field. 3615 The subnet is represented by the number of network address bits 3616 present in the host address (the 3617 <quote>netmask</quote>). If the netmask is 32 and the address is IPv4, 3618 then the value does not indicate a subnet, only a single host. 3619 In IPv6, the address length is 128 bits, so 128 bits specify a 3620 unique host address. Note that if you 3621 want to accept only networks, you should use the 3622 <type>cidr</type> type rather than <type>inet</type>. 3623 </para> 3624 3625 <para> 3626 The input format for this type is 3627 <replaceable class="parameter">address/y</replaceable> 3628 where 3629 <replaceable class="parameter">address</replaceable> 3630 is an IPv4 or IPv6 address and 3631 <replaceable class="parameter">y</replaceable> 3632 is the number of bits in the netmask. If the 3633 <replaceable class="parameter">/y</replaceable> 3634 portion is omitted, the 3635 netmask is taken to be 32 for IPv4 or 128 for IPv6, 3636 so the value represents 3637 just a single host. On display, the 3638 <replaceable class="parameter">/y</replaceable> 3639 portion is suppressed if the netmask specifies a single host. 3640 </para> 3641 </sect2> 3642 3643 <sect2 id="datatype-cidr"> 3644 <title><type>cidr</type></title> 3645 3646 <indexterm> 3647 <primary>cidr</primary> 3648 </indexterm> 3649 3650 <para> 3651 The <type>cidr</type> type holds an IPv4 or IPv6 network specification. 3652 Input and output formats follow Classless Internet Domain Routing 3653 conventions. 3654 The format for specifying networks is <replaceable 3655 class="parameter">address/y</replaceable> where <replaceable 3656 class="parameter">address</replaceable> is the network's lowest 3657 address represented as an 3658 IPv4 or IPv6 address, and <replaceable 3659 class="parameter">y</replaceable> is the number of bits in the netmask. If 3660 <replaceable class="parameter">y</replaceable> is omitted, it is calculated 3661 using assumptions from the older classful network numbering system, except 3662 it will be at least large enough to include all of the octets 3663 written in the input. It is an error to specify a network address 3664 that has bits set to the right of the specified netmask. 3665 </para> 3666 3667 <para> 3668 <xref linkend="datatype-net-cidr-table"/> shows some examples. 3669 </para> 3670 3671 <table id="datatype-net-cidr-table"> 3672 <title><type>cidr</type> Type Input Examples</title> 3673 <tgroup cols="3"> 3674 <thead> 3675 <row> 3676 <entry><type>cidr</type> Input</entry> 3677 <entry><type>cidr</type> Output</entry> 3678 <entry><literal><function>abbrev(<type>cidr</type>)</function></literal></entry> 3679 </row> 3680 </thead> 3681 <tbody> 3682 <row> 3683 <entry>192.168.100.128/25</entry> 3684 <entry>192.168.100.128/25</entry> 3685 <entry>192.168.100.128/25</entry> 3686 </row> 3687 <row> 3688 <entry>192.168/24</entry> 3689 <entry>192.168.0.0/24</entry> 3690 <entry>192.168.0/24</entry> 3691 </row> 3692 <row> 3693 <entry>192.168/25</entry> 3694 <entry>192.168.0.0/25</entry> 3695 <entry>192.168.0.0/25</entry> 3696 </row> 3697 <row> 3698 <entry>192.168.1</entry> 3699 <entry>192.168.1.0/24</entry> 3700 <entry>192.168.1/24</entry> 3701 </row> 3702 <row> 3703 <entry>192.168</entry> 3704 <entry>192.168.0.0/24</entry> 3705 <entry>192.168.0/24</entry> 3706 </row> 3707 <row> 3708 <entry>128.1</entry> 3709 <entry>128.1.0.0/16</entry> 3710 <entry>128.1/16</entry> 3711 </row> 3712 <row> 3713 <entry>128</entry> 3714 <entry>128.0.0.0/16</entry> 3715 <entry>128.0/16</entry> 3716 </row> 3717 <row> 3718 <entry>128.1.2</entry> 3719 <entry>128.1.2.0/24</entry> 3720 <entry>128.1.2/24</entry> 3721 </row> 3722 <row> 3723 <entry>10.1.2</entry> 3724 <entry>10.1.2.0/24</entry> 3725 <entry>10.1.2/24</entry> 3726 </row> 3727 <row> 3728 <entry>10.1</entry> 3729 <entry>10.1.0.0/16</entry> 3730 <entry>10.1/16</entry> 3731 </row> 3732 <row> 3733 <entry>10</entry> 3734 <entry>10.0.0.0/8</entry> 3735 <entry>10/8</entry> 3736 </row> 3737 <row> 3738 <entry>10.1.2.3/32</entry> 3739 <entry>10.1.2.3/32</entry> 3740 <entry>10.1.2.3/32</entry> 3741 </row> 3742 <row> 3743 <entry>2001:4f8:3:ba::/64</entry> 3744 <entry>2001:4f8:3:ba::/64</entry> 3745 <entry>2001:4f8:3:ba/64</entry> 3746 </row> 3747 <row> 3748 <entry>2001:4f8:3:ba:&zwsp;2e0:81ff:fe22:d1f1/128</entry> 3749 <entry>2001:4f8:3:ba:&zwsp;2e0:81ff:fe22:d1f1/128</entry> 3750 <entry>2001:4f8:3:ba:&zwsp;2e0:81ff:fe22:d1f1/128</entry> 3751 </row> 3752 <row> 3753 <entry>::ffff:1.2.3.0/120</entry> 3754 <entry>::ffff:1.2.3.0/120</entry> 3755 <entry>::ffff:1.2.3/120</entry> 3756 </row> 3757 <row> 3758 <entry>::ffff:1.2.3.0/128</entry> 3759 <entry>::ffff:1.2.3.0/128</entry> 3760 <entry>::ffff:1.2.3.0/128</entry> 3761 </row> 3762 </tbody> 3763 </tgroup> 3764 </table> 3765 </sect2> 3766 3767 <sect2 id="datatype-inet-vs-cidr"> 3768 <title><type>inet</type> vs. <type>cidr</type></title> 3769 3770 <para> 3771 The essential difference between <type>inet</type> and <type>cidr</type> 3772 data types is that <type>inet</type> accepts values with nonzero bits to 3773 the right of the netmask, whereas <type>cidr</type> does not. For 3774 example, <literal>192.168.0.1/24</literal> is valid for <type>inet</type> 3775 but not for <type>cidr</type>. 3776 </para> 3777 3778 <tip> 3779 <para> 3780 If you do not like the output format for <type>inet</type> or 3781 <type>cidr</type> values, try the functions <function>host</function>, 3782 <function>text</function>, and <function>abbrev</function>. 3783 </para> 3784 </tip> 3785 </sect2> 3786 3787 <sect2 id="datatype-macaddr"> 3788 <title><type>macaddr</type></title> 3789 3790 <indexterm> 3791 <primary>macaddr (data type)</primary> 3792 </indexterm> 3793 3794 <indexterm> 3795 <primary>MAC address</primary> 3796 <see>macaddr</see> 3797 </indexterm> 3798 3799 <para> 3800 The <type>macaddr</type> type stores MAC addresses, known for example 3801 from Ethernet card hardware addresses (although MAC addresses are 3802 used for other purposes as well). Input is accepted in the 3803 following formats: 3804 3805 <simplelist> 3806 <member><literal>'08:00:2b:01:02:03'</literal></member> 3807 <member><literal>'08-00-2b-01-02-03'</literal></member> 3808 <member><literal>'08002b:010203'</literal></member> 3809 <member><literal>'08002b-010203'</literal></member> 3810 <member><literal>'0800.2b01.0203'</literal></member> 3811 <member><literal>'0800-2b01-0203'</literal></member> 3812 <member><literal>'08002b010203'</literal></member> 3813 </simplelist> 3814 3815 These examples all specify the same address. Upper and 3816 lower case is accepted for the digits 3817 <literal>a</literal> through <literal>f</literal>. Output is always in the 3818 first of the forms shown. 3819 </para> 3820 3821 <para> 3822 IEEE Std 802-2001 specifies the second shown form (with hyphens) 3823 as the canonical form for MAC addresses, and specifies the first 3824 form (with colons) as the bit-reversed notation, so that 3825 08-00-2b-01-02-03 = 01:00:4D:08:04:0C. This convention is widely 3826 ignored nowadays, and it is relevant only for obsolete network 3827 protocols (such as Token Ring). PostgreSQL makes no provisions 3828 for bit reversal, and all accepted formats use the canonical LSB 3829 order. 3830 </para> 3831 3832 <para> 3833 The remaining five input formats are not part of any standard. 3834 </para> 3835 </sect2> 3836 3837 <sect2 id="datatype-macaddr8"> 3838 <title><type>macaddr8</type></title> 3839 3840 <indexterm> 3841 <primary>macaddr8 (data type)</primary> 3842 </indexterm> 3843 3844 <indexterm> 3845 <primary>MAC address (EUI-64 format)</primary> 3846 <see>macaddr</see> 3847 </indexterm> 3848 3849 <para> 3850 The <type>macaddr8</type> type stores MAC addresses in EUI-64 3851 format, known for example from Ethernet card hardware addresses 3852 (although MAC addresses are used for other purposes as well). 3853 This type can accept both 6 and 8 byte length MAC addresses 3854 and stores them in 8 byte length format. MAC addresses given 3855 in 6 byte format will be stored in 8 byte length format with the 3856 4th and 5th bytes set to FF and FE, respectively. 3857 3858 Note that IPv6 uses a modified EUI-64 format where the 7th bit 3859 should be set to one after the conversion from EUI-48. The 3860 function <function>macaddr8_set7bit</function> is provided to make this 3861 change. 3862 3863 Generally speaking, any input which is comprised of pairs of hex 3864 digits (on byte boundaries), optionally separated consistently by 3865 one of <literal>':'</literal>, <literal>'-'</literal> or <literal>'.'</literal>, is 3866 accepted. The number of hex digits must be either 16 (8 bytes) or 3867 12 (6 bytes). Leading and trailing whitespace is ignored. 3868 3869 The following are examples of input formats that are accepted: 3870 3871 <simplelist> 3872 <member><literal>'08:00:2b:01:02:03:04:05'</literal></member> 3873 <member><literal>'08-00-2b-01-02-03-04-05'</literal></member> 3874 <member><literal>'08002b:0102030405'</literal></member> 3875 <member><literal>'08002b-0102030405'</literal></member> 3876 <member><literal>'0800.2b01.0203.0405'</literal></member> 3877 <member><literal>'0800-2b01-0203-0405'</literal></member> 3878 <member><literal>'08002b01:02030405'</literal></member> 3879 <member><literal>'08002b0102030405'</literal></member> 3880 </simplelist> 3881 3882 These examples all specify the same address. Upper and 3883 lower case is accepted for the digits 3884 <literal>a</literal> through <literal>f</literal>. Output is always in the 3885 first of the forms shown. 3886 </para> 3887 3888 <para> 3889 The last six input formats shown above are not part of any standard. 3890 </para> 3891 3892 <para> 3893 To convert a traditional 48 bit MAC address in EUI-48 format to 3894 modified EUI-64 format to be included as the host portion of an 3895 IPv6 address, use <function>macaddr8_set7bit</function> as shown: 3896 3897<programlisting> 3898SELECT macaddr8_set7bit('08:00:2b:01:02:03'); 3899<computeroutput> 3900 macaddr8_set7bit 3901------------------------- 3902 0a:00:2b:ff:fe:01:02:03 3903(1 row) 3904</computeroutput> 3905</programlisting> 3906 3907 </para> 3908 3909 </sect2> 3910 3911 </sect1> 3912 3913 <sect1 id="datatype-bit"> 3914 <title>Bit String Types</title> 3915 3916 <indexterm zone="datatype-bit"> 3917 <primary>bit string</primary> 3918 <secondary>data type</secondary> 3919 </indexterm> 3920 3921 <para> 3922 Bit strings are strings of 1's and 0's. They can be used to store 3923 or visualize bit masks. There are two SQL bit types: 3924 <type>bit(<replaceable>n</replaceable>)</type> and <type>bit 3925 varying(<replaceable>n</replaceable>)</type>, where 3926 <replaceable>n</replaceable> is a positive integer. 3927 </para> 3928 3929 <para> 3930 <type>bit</type> type data must match the length 3931 <replaceable>n</replaceable> exactly; it is an error to attempt to 3932 store shorter or longer bit strings. <type>bit varying</type> data is 3933 of variable length up to the maximum length 3934 <replaceable>n</replaceable>; longer strings will be rejected. 3935 Writing <type>bit</type> without a length is equivalent to 3936 <literal>bit(1)</literal>, while <type>bit varying</type> without a length 3937 specification means unlimited length. 3938 </para> 3939 3940 <note> 3941 <para> 3942 If one explicitly casts a bit-string value to 3943 <type>bit(<replaceable>n</replaceable>)</type>, it will be truncated or 3944 zero-padded on the right to be exactly <replaceable>n</replaceable> bits, 3945 without raising an error. Similarly, 3946 if one explicitly casts a bit-string value to 3947 <type>bit varying(<replaceable>n</replaceable>)</type>, it will be truncated 3948 on the right if it is more than <replaceable>n</replaceable> bits. 3949 </para> 3950 </note> 3951 3952 <para> 3953 Refer to <xref 3954 linkend="sql-syntax-bit-strings"/> for information about the syntax 3955 of bit string constants. Bit-logical operators and string 3956 manipulation functions are available; see <xref 3957 linkend="functions-bitstring"/>. 3958 </para> 3959 3960 <example> 3961 <title>Using the Bit String Types</title> 3962 3963<programlisting> 3964CREATE TABLE test (a BIT(3), b BIT VARYING(5)); 3965INSERT INTO test VALUES (B'101', B'00'); 3966INSERT INTO test VALUES (B'10', B'101'); 3967<computeroutput> 3968ERROR: bit string length 2 does not match type bit(3) 3969</computeroutput> 3970INSERT INTO test VALUES (B'10'::bit(3), B'101'); 3971SELECT * FROM test; 3972<computeroutput> 3973 a | b 3974-----+----- 3975 101 | 00 3976 100 | 101 3977</computeroutput> 3978</programlisting> 3979 </example> 3980 3981 <para> 3982 A bit string value requires 1 byte for each group of 8 bits, plus 3983 5 or 8 bytes overhead depending on the length of the string 3984 (but long values may be compressed or moved out-of-line, as explained 3985 in <xref linkend="datatype-character"/> for character strings). 3986 </para> 3987 </sect1> 3988 3989 <sect1 id="datatype-textsearch"> 3990 <title>Text Search Types</title> 3991 3992 <indexterm zone="datatype-textsearch"> 3993 <primary>full text search</primary> 3994 <secondary>data types</secondary> 3995 </indexterm> 3996 3997 <indexterm zone="datatype-textsearch"> 3998 <primary>text search</primary> 3999 <secondary>data types</secondary> 4000 </indexterm> 4001 4002 <para> 4003 <productname>PostgreSQL</productname> provides two data types that 4004 are designed to support full text search, which is the activity of 4005 searching through a collection of natural-language <firstterm>documents</firstterm> 4006 to locate those that best match a <firstterm>query</firstterm>. 4007 The <type>tsvector</type> type represents a document in a form optimized 4008 for text search; the <type>tsquery</type> type similarly represents 4009 a text query. 4010 <xref linkend="textsearch"/> provides a detailed explanation of this 4011 facility, and <xref linkend="functions-textsearch"/> summarizes the 4012 related functions and operators. 4013 </para> 4014 4015 <sect2 id="datatype-tsvector"> 4016 <title><type>tsvector</type></title> 4017 4018 <indexterm> 4019 <primary>tsvector (data type)</primary> 4020 </indexterm> 4021 4022 <para> 4023 A <type>tsvector</type> value is a sorted list of distinct 4024 <firstterm>lexemes</firstterm>, which are words that have been 4025 <firstterm>normalized</firstterm> to merge different variants of the same word 4026 (see <xref linkend="textsearch"/> for details). Sorting and 4027 duplicate-elimination are done automatically during input, as shown in 4028 this example: 4029 4030<programlisting> 4031SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector; 4032 tsvector 4033---------------------------------------------------- 4034 'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat' 4035</programlisting> 4036 4037 To represent 4038 lexemes containing whitespace or punctuation, surround them with quotes: 4039 4040<programlisting> 4041SELECT $$the lexeme ' ' contains spaces$$::tsvector; 4042 tsvector 4043------------------------------------------- 4044 ' ' 'contains' 'lexeme' 'spaces' 'the' 4045</programlisting> 4046 4047 (We use dollar-quoted string literals in this example and the next one 4048 to avoid the confusion of having to double quote marks within the 4049 literals.) Embedded quotes and backslashes must be doubled: 4050 4051<programlisting> 4052SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector; 4053 tsvector 4054------------------------------------------------ 4055 'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the' 4056</programlisting> 4057 4058 Optionally, integer <firstterm>positions</firstterm> 4059 can be attached to lexemes: 4060 4061<programlisting> 4062SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector; 4063 tsvector 4064-------------------------------------------------------------------&zwsp;------------ 4065 'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4 4066</programlisting> 4067 4068 A position normally indicates the source word's location in the 4069 document. Positional information can be used for 4070 <firstterm>proximity ranking</firstterm>. Position values can 4071 range from 1 to 16383; larger numbers are silently set to 16383. 4072 Duplicate positions for the same lexeme are discarded. 4073 </para> 4074 4075 <para> 4076 Lexemes that have positions can further be labeled with a 4077 <firstterm>weight</firstterm>, which can be <literal>A</literal>, 4078 <literal>B</literal>, <literal>C</literal>, or <literal>D</literal>. 4079 <literal>D</literal> is the default and hence is not shown on output: 4080 4081<programlisting> 4082SELECT 'a:1A fat:2B,4C cat:5D'::tsvector; 4083 tsvector 4084---------------------------- 4085 'a':1A 'cat':5 'fat':2B,4C 4086</programlisting> 4087 4088 Weights are typically used to reflect document structure, for example 4089 by marking title words differently from body words. Text search 4090 ranking functions can assign different priorities to the different 4091 weight markers. 4092 </para> 4093 4094 <para> 4095 It is important to understand that the 4096 <type>tsvector</type> type itself does not perform any word 4097 normalization; it assumes the words it is given are normalized 4098 appropriately for the application. For example, 4099 4100<programlisting> 4101SELECT 'The Fat Rats'::tsvector; 4102 tsvector 4103-------------------- 4104 'Fat' 'Rats' 'The' 4105</programlisting> 4106 4107 For most English-text-searching applications the above words would 4108 be considered non-normalized, but <type>tsvector</type> doesn't care. 4109 Raw document text should usually be passed through 4110 <function>to_tsvector</function> to normalize the words appropriately 4111 for searching: 4112 4113<programlisting> 4114SELECT to_tsvector('english', 'The Fat Rats'); 4115 to_tsvector 4116----------------- 4117 'fat':2 'rat':3 4118</programlisting> 4119 4120 Again, see <xref linkend="textsearch"/> for more detail. 4121 </para> 4122 4123 </sect2> 4124 4125 <sect2 id="datatype-tsquery"> 4126 <title><type>tsquery</type></title> 4127 4128 <indexterm> 4129 <primary>tsquery (data type)</primary> 4130 </indexterm> 4131 4132 <para> 4133 A <type>tsquery</type> value stores lexemes that are to be 4134 searched for, and can combine them using the Boolean operators 4135 <literal>&</literal> (AND), <literal>|</literal> (OR), and 4136 <literal>!</literal> (NOT), as well as the phrase search operator 4137 <literal><-></literal> (FOLLOWED BY). There is also a variant 4138 <literal><<replaceable>N</replaceable>></literal> of the FOLLOWED BY 4139 operator, where <replaceable>N</replaceable> is an integer constant that 4140 specifies the distance between the two lexemes being searched 4141 for. <literal><-></literal> is equivalent to <literal><1></literal>. 4142 </para> 4143 4144 <para> 4145 Parentheses can be used to enforce grouping of these operators. 4146 In the absence of parentheses, <literal>!</literal> (NOT) binds most tightly, 4147 <literal><-></literal> (FOLLOWED BY) next most tightly, then 4148 <literal>&</literal> (AND), with <literal>|</literal> (OR) binding 4149 the least tightly. 4150 </para> 4151 4152 <para> 4153 Here are some examples: 4154 4155<programlisting> 4156SELECT 'fat & rat'::tsquery; 4157 tsquery 4158--------------- 4159 'fat' & 'rat' 4160 4161SELECT 'fat & (rat | cat)'::tsquery; 4162 tsquery 4163--------------------------- 4164 'fat' & ( 'rat' | 'cat' ) 4165 4166SELECT 'fat & rat & ! cat'::tsquery; 4167 tsquery 4168------------------------ 4169 'fat' & 'rat' & !'cat' 4170</programlisting> 4171 </para> 4172 4173 <para> 4174 Optionally, lexemes in a <type>tsquery</type> can be labeled with 4175 one or more weight letters, which restricts them to match only 4176 <type>tsvector</type> lexemes with one of those weights: 4177 4178<programlisting> 4179SELECT 'fat:ab & cat'::tsquery; 4180 tsquery 4181------------------ 4182 'fat':AB & 'cat' 4183</programlisting> 4184 </para> 4185 4186 <para> 4187 Also, lexemes in a <type>tsquery</type> can be labeled with <literal>*</literal> 4188 to specify prefix matching: 4189<programlisting> 4190SELECT 'super:*'::tsquery; 4191 tsquery 4192----------- 4193 'super':* 4194</programlisting> 4195 This query will match any word in a <type>tsvector</type> that begins 4196 with <quote>super</quote>. 4197 </para> 4198 4199 <para> 4200 Quoting rules for lexemes are the same as described previously for 4201 lexemes in <type>tsvector</type>; and, as with <type>tsvector</type>, 4202 any required normalization of words must be done before converting 4203 to the <type>tsquery</type> type. The <function>to_tsquery</function> 4204 function is convenient for performing such normalization: 4205 4206<programlisting> 4207SELECT to_tsquery('Fat:ab & Cats'); 4208 to_tsquery 4209------------------ 4210 'fat':AB & 'cat' 4211</programlisting> 4212 4213 Note that <function>to_tsquery</function> will process prefixes in the same way 4214 as other words, which means this comparison returns true: 4215 4216<programlisting> 4217SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' ); 4218 ?column? 4219---------- 4220 t 4221</programlisting> 4222 because <literal>postgres</literal> gets stemmed to <literal>postgr</literal>: 4223<programlisting> 4224SELECT to_tsvector( 'postgraduate' ), to_tsquery( 'postgres:*' ); 4225 to_tsvector | to_tsquery 4226---------------+------------ 4227 'postgradu':1 | 'postgr':* 4228</programlisting> 4229 which will match the stemmed form of <literal>postgraduate</literal>. 4230 </para> 4231 4232 </sect2> 4233 4234 </sect1> 4235 4236 <sect1 id="datatype-uuid"> 4237 <title><acronym>UUID</acronym> Type</title> 4238 4239 <indexterm zone="datatype-uuid"> 4240 <primary>UUID</primary> 4241 </indexterm> 4242 4243 <para> 4244 The data type <type>uuid</type> stores Universally Unique Identifiers 4245 (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. 4246 (Some systems refer to this data type as a globally unique identifier, or 4247 GUID,<indexterm><primary>GUID</primary></indexterm> instead.) This 4248 identifier is a 128-bit quantity that is generated by an algorithm chosen 4249 to make it very unlikely that the same identifier will be generated by 4250 anyone else in the known universe using the same algorithm. Therefore, 4251 for distributed systems, these identifiers provide a better uniqueness 4252 guarantee than sequence generators, which 4253 are only unique within a single database. 4254 </para> 4255 4256 <para> 4257 A UUID is written as a sequence of lower-case hexadecimal digits, 4258 in several groups separated by hyphens, specifically a group of 8 4259 digits followed by three groups of 4 digits followed by a group of 4260 12 digits, for a total of 32 digits representing the 128 bits. An 4261 example of a UUID in this standard form is: 4262<programlisting> 4263a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 4264</programlisting> 4265 <productname>PostgreSQL</productname> also accepts the following 4266 alternative forms for input: 4267 use of upper-case digits, the standard format surrounded by 4268 braces, omitting some or all hyphens, adding a hyphen after any 4269 group of four digits. Examples are: 4270<programlisting> 4271A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11 4272{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11} 4273a0eebc999c0b4ef8bb6d6bb9bd380a11 4274a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11 4275{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11} 4276</programlisting> 4277 Output is always in the standard form. 4278 </para> 4279 4280 <para> 4281 See <xref linkend="functions-uuid"/> for how to generate a UUID in 4282 <productname>PostgreSQL</productname>. 4283 </para> 4284 </sect1> 4285 4286 <sect1 id="datatype-xml"> 4287 <title><acronym>XML</acronym> Type</title> 4288 4289 <indexterm zone="datatype-xml"> 4290 <primary>XML</primary> 4291 </indexterm> 4292 4293 <para> 4294 The <type>xml</type> data type can be used to store XML data. Its 4295 advantage over storing XML data in a <type>text</type> field is that it 4296 checks the input values for well-formedness, and there are support 4297 functions to perform type-safe operations on it; see <xref 4298 linkend="functions-xml"/>. Use of this data type requires the 4299 installation to have been built with <command>configure 4300 --with-libxml</command>. 4301 </para> 4302 4303 <para> 4304 The <type>xml</type> type can store well-formed 4305 <quote>documents</quote>, as defined by the XML standard, as well 4306 as <quote>content</quote> fragments, which are defined by reference 4307 to the more permissive 4308 <ulink url="https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/#DocumentNode"><quote>document node</quote></ulink> 4309 of the XQuery and XPath data model. 4310 Roughly, this means that content fragments can have 4311 more than one top-level element or character node. The expression 4312 <literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal> 4313 can be used to evaluate whether a particular <type>xml</type> 4314 value is a full document or only a content fragment. 4315 </para> 4316 4317 <para> 4318 Limits and compatibility notes for the <type>xml</type> data type 4319 can be found in <xref linkend="xml-limits-conformance"/>. 4320 </para> 4321 4322 <sect2> 4323 <title>Creating XML Values</title> 4324 <para> 4325 To produce a value of type <type>xml</type> from character data, 4326 use the function 4327 <function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm> 4328<synopsis> 4329XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>) 4330</synopsis> 4331 Examples: 4332<programlisting><![CDATA[ 4333XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>') 4334XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>') 4335]]></programlisting> 4336 While this is the only way to convert character strings into XML 4337 values according to the SQL standard, the PostgreSQL-specific 4338 syntaxes: 4339<programlisting><![CDATA[ 4340xml '<foo>bar</foo>' 4341'<foo>bar</foo>'::xml 4342]]></programlisting> 4343 can also be used. 4344 </para> 4345 4346 <para> 4347 The <type>xml</type> type does not validate input values 4348 against a document type declaration 4349 (DTD),<indexterm><primary>DTD</primary></indexterm> 4350 even when the input value specifies a DTD. 4351 There is also currently no built-in support for validating against 4352 other XML schema languages such as XML Schema. 4353 </para> 4354 4355 <para> 4356 The inverse operation, producing a character string value from 4357 <type>xml</type>, uses the function 4358 <function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm> 4359<synopsis> 4360XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> ) 4361</synopsis> 4362 <replaceable>type</replaceable> can be 4363 <type>character</type>, <type>character varying</type>, or 4364 <type>text</type> (or an alias for one of those). Again, according 4365 to the SQL standard, this is the only way to convert between type 4366 <type>xml</type> and character types, but PostgreSQL also allows 4367 you to simply cast the value. 4368 </para> 4369 4370 <para> 4371 When a character string value is cast to or from type 4372 <type>xml</type> without going through <type>XMLPARSE</type> or 4373 <type>XMLSERIALIZE</type>, respectively, the choice of 4374 <literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is 4375 determined by the <quote>XML option</quote> 4376 <indexterm><primary>XML option</primary></indexterm> 4377 session configuration parameter, which can be set using the 4378 standard command: 4379<synopsis> 4380SET XML OPTION { DOCUMENT | CONTENT }; 4381</synopsis> 4382 or the more PostgreSQL-like syntax 4383<synopsis> 4384SET xmloption TO { DOCUMENT | CONTENT }; 4385</synopsis> 4386 The default is <literal>CONTENT</literal>, so all forms of XML 4387 data are allowed. 4388 </para> 4389 4390 </sect2> 4391 4392 <sect2> 4393 <title>Encoding Handling</title> 4394 <para> 4395 Care must be taken when dealing with multiple character encodings 4396 on the client, server, and in the XML data passed through them. 4397 When using the text mode to pass queries to the server and query 4398 results to the client (which is the normal mode), PostgreSQL 4399 converts all character data passed between the client and the 4400 server and vice versa to the character encoding of the respective 4401 end; see <xref linkend="multibyte"/>. This includes string 4402 representations of XML values, such as in the above examples. 4403 This would ordinarily mean that encoding declarations contained in 4404 XML data can become invalid as the character data is converted 4405 to other encodings while traveling between client and server, 4406 because the embedded encoding declaration is not changed. To cope 4407 with this behavior, encoding declarations contained in 4408 character strings presented for input to the <type>xml</type> type 4409 are <emphasis>ignored</emphasis>, and content is assumed 4410 to be in the current server encoding. Consequently, for correct 4411 processing, character strings of XML data must be sent 4412 from the client in the current client encoding. It is the 4413 responsibility of the client to either convert documents to the 4414 current client encoding before sending them to the server, or to 4415 adjust the client encoding appropriately. On output, values of 4416 type <type>xml</type> will not have an encoding declaration, and 4417 clients should assume all data is in the current client 4418 encoding. 4419 </para> 4420 4421 <para> 4422 When using binary mode to pass query parameters to the server 4423 and query results back to the client, no encoding conversion 4424 is performed, so the situation is different. In this case, an 4425 encoding declaration in the XML data will be observed, and if it 4426 is absent, the data will be assumed to be in UTF-8 (as required by 4427 the XML standard; note that PostgreSQL does not support UTF-16). 4428 On output, data will have an encoding declaration 4429 specifying the client encoding, unless the client encoding is 4430 UTF-8, in which case it will be omitted. 4431 </para> 4432 4433 <para> 4434 Needless to say, processing XML data with PostgreSQL will be less 4435 error-prone and more efficient if the XML data encoding, client encoding, 4436 and server encoding are the same. Since XML data is internally 4437 processed in UTF-8, computations will be most efficient if the 4438 server encoding is also UTF-8. 4439 </para> 4440 4441 <caution> 4442 <para> 4443 Some XML-related functions may not work at all on non-ASCII data 4444 when the server encoding is not UTF-8. This is known to be an 4445 issue for <function>xmltable()</function> and <function>xpath()</function> in particular. 4446 </para> 4447 </caution> 4448 </sect2> 4449 4450 <sect2> 4451 <title>Accessing XML Values</title> 4452 4453 <para> 4454 The <type>xml</type> data type is unusual in that it does not 4455 provide any comparison operators. This is because there is no 4456 well-defined and universally useful comparison algorithm for XML 4457 data. One consequence of this is that you cannot retrieve rows by 4458 comparing an <type>xml</type> column against a search value. XML 4459 values should therefore typically be accompanied by a separate key 4460 field such as an ID. An alternative solution for comparing XML 4461 values is to convert them to character strings first, but note 4462 that character string comparison has little to do with a useful 4463 XML comparison method. 4464 </para> 4465 4466 <para> 4467 Since there are no comparison operators for the <type>xml</type> 4468 data type, it is not possible to create an index directly on a 4469 column of this type. If speedy searches in XML data are desired, 4470 possible workarounds include casting the expression to a 4471 character string type and indexing that, or indexing an XPath 4472 expression. Of course, the actual query would have to be adjusted 4473 to search by the indexed expression. 4474 </para> 4475 4476 <para> 4477 The text-search functionality in PostgreSQL can also be used to speed 4478 up full-document searches of XML data. The necessary 4479 preprocessing support is, however, not yet available in the PostgreSQL 4480 distribution. 4481 </para> 4482 </sect2> 4483 </sect1> 4484 4485 &json; 4486 4487 &array; 4488 4489 &rowtypes; 4490 4491 &rangetypes; 4492 4493 <sect1 id="domains"> 4494 <title>Domain Types</title> 4495 4496 <indexterm zone="domains"> 4497 <primary>domain</primary> 4498 </indexterm> 4499 4500 <indexterm zone="domains"> 4501 <primary>data type</primary> 4502 <secondary>domain</secondary> 4503 </indexterm> 4504 4505 <para> 4506 A <firstterm>domain</firstterm> is a user-defined data type that is 4507 based on another <firstterm>underlying type</firstterm>. Optionally, 4508 it can have constraints that restrict its valid values to a subset of 4509 what the underlying type would allow. Otherwise it behaves like the 4510 underlying type — for example, any operator or function that 4511 can be applied to the underlying type will work on the domain type. 4512 The underlying type can be any built-in or user-defined base type, 4513 enum type, array type, composite type, range type, or another domain. 4514 </para> 4515 4516 <para> 4517 For example, we could create a domain over integers that accepts only 4518 positive integers: 4519<programlisting> 4520CREATE DOMAIN posint AS integer CHECK (VALUE > 0); 4521CREATE TABLE mytable (id posint); 4522INSERT INTO mytable VALUES(1); -- works 4523INSERT INTO mytable VALUES(-1); -- fails 4524</programlisting> 4525 </para> 4526 4527 <para> 4528 When an operator or function of the underlying type is applied to a 4529 domain value, the domain is automatically down-cast to the underlying 4530 type. Thus, for example, the result of <literal>mytable.id - 1</literal> 4531 is considered to be of type <type>integer</type> not <type>posint</type>. 4532 We could write <literal>(mytable.id - 1)::posint</literal> to cast the 4533 result back to <type>posint</type>, causing the domain's constraints 4534 to be rechecked. In this case, that would result in an error if the 4535 expression had been applied to an <structfield>id</structfield> value of 4536 1. Assigning a value of the underlying type to a field or variable of 4537 the domain type is allowed without writing an explicit cast, but the 4538 domain's constraints will be checked. 4539 </para> 4540 4541 <para> 4542 For additional information see <xref linkend="sql-createdomain"/>. 4543 </para> 4544 </sect1> 4545 4546 <sect1 id="datatype-oid"> 4547 <title>Object Identifier Types</title> 4548 4549 <indexterm zone="datatype-oid"> 4550 <primary>object identifier</primary> 4551 <secondary>data type</secondary> 4552 </indexterm> 4553 4554 <indexterm zone="datatype-oid"> 4555 <primary>oid</primary> 4556 </indexterm> 4557 4558 <indexterm zone="datatype-oid"> 4559 <primary>regclass</primary> 4560 </indexterm> 4561 4562 <indexterm zone="datatype-oid"> 4563 <primary>regcollation</primary> 4564 </indexterm> 4565 4566 <indexterm zone="datatype-oid"> 4567 <primary>regconfig</primary> 4568 </indexterm> 4569 4570 <indexterm zone="datatype-oid"> 4571 <primary>regdictionary</primary> 4572 </indexterm> 4573 4574 <indexterm zone="datatype-oid"> 4575 <primary>regnamespace</primary> 4576 </indexterm> 4577 4578 <indexterm zone="datatype-oid"> 4579 <primary>regoper</primary> 4580 </indexterm> 4581 4582 <indexterm zone="datatype-oid"> 4583 <primary>regoperator</primary> 4584 </indexterm> 4585 4586 <indexterm zone="datatype-oid"> 4587 <primary>regproc</primary> 4588 </indexterm> 4589 4590 <indexterm zone="datatype-oid"> 4591 <primary>regprocedure</primary> 4592 </indexterm> 4593 4594 <indexterm zone="datatype-oid"> 4595 <primary>regrole</primary> 4596 </indexterm> 4597 4598 <indexterm zone="datatype-oid"> 4599 <primary>regtype</primary> 4600 </indexterm> 4601 4602 <indexterm zone="datatype-oid"> 4603 <primary>xid8</primary> 4604 </indexterm> 4605 4606 <indexterm zone="datatype-oid"> 4607 <primary>cid</primary> 4608 </indexterm> 4609 4610 <indexterm zone="datatype-oid"> 4611 <primary>tid</primary> 4612 </indexterm> 4613 4614 <indexterm zone="datatype-oid"> 4615 <primary>xid</primary> 4616 </indexterm> 4617 4618 <para> 4619 Object identifiers (OIDs) are used internally by 4620 <productname>PostgreSQL</productname> as primary keys for various 4621 system tables. 4622 Type <type>oid</type> represents an object identifier. There are also 4623 several alias types for <type>oid</type> named <type>reg<replaceable>something</replaceable></type>. 4624 <xref linkend="datatype-oid-table"/> shows an 4625 overview. 4626 </para> 4627 4628 <para> 4629 The <type>oid</type> type is currently implemented as an unsigned 4630 four-byte integer. Therefore, it is not large enough to provide 4631 database-wide uniqueness in large databases, or even in large 4632 individual tables. 4633 </para> 4634 4635 <para> 4636 The <type>oid</type> type itself has few operations beyond comparison. 4637 It can be cast to integer, however, and then manipulated using the 4638 standard integer operators. (Beware of possible 4639 signed-versus-unsigned confusion if you do this.) 4640 </para> 4641 4642 <para> 4643 The OID alias types have no operations of their own except 4644 for specialized input and output routines. These routines are able 4645 to accept and display symbolic names for system objects, rather than 4646 the raw numeric value that type <type>oid</type> would use. The alias 4647 types allow simplified lookup of OID values for objects. For example, 4648 to examine the <structname>pg_attribute</structname> rows related to a table 4649 <literal>mytable</literal>, one could write: 4650<programlisting> 4651SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass; 4652</programlisting> 4653 rather than: 4654<programlisting> 4655SELECT * FROM pg_attribute 4656 WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable'); 4657</programlisting> 4658 While that doesn't look all that bad by itself, it's still oversimplified. 4659 A far more complicated sub-select would be needed to 4660 select the right OID if there are multiple tables named 4661 <literal>mytable</literal> in different schemas. 4662 The <type>regclass</type> input converter handles the table lookup according 4663 to the schema path setting, and so it does the <quote>right thing</quote> 4664 automatically. Similarly, casting a table's OID to 4665 <type>regclass</type> is handy for symbolic display of a numeric OID. 4666 </para> 4667 4668 <table id="datatype-oid-table"> 4669 <title>Object Identifier Types</title> 4670 <tgroup cols="4"> 4671 <thead> 4672 <row> 4673 <entry>Name</entry> 4674 <entry>References</entry> 4675 <entry>Description</entry> 4676 <entry>Value Example</entry> 4677 </row> 4678 </thead> 4679 4680 <tbody> 4681 4682 <row> 4683 <entry><type>oid</type></entry> 4684 <entry>any</entry> 4685 <entry>numeric object identifier</entry> 4686 <entry><literal>564182</literal></entry> 4687 </row> 4688 4689 <row> 4690 <entry><type>regclass</type></entry> 4691 <entry><structname>pg_class</structname></entry> 4692 <entry>relation name</entry> 4693 <entry><literal>pg_type</literal></entry> 4694 </row> 4695 4696 <row> 4697 <entry><type>regcollation</type></entry> 4698 <entry><structname>pg_collation</structname></entry> 4699 <entry>collation name</entry> 4700 <entry><literal>"POSIX"</literal></entry> 4701 </row> 4702 4703 <row> 4704 <entry><type>regconfig</type></entry> 4705 <entry><structname>pg_ts_config</structname></entry> 4706 <entry>text search configuration</entry> 4707 <entry><literal>english</literal></entry> 4708 </row> 4709 4710 <row> 4711 <entry><type>regdictionary</type></entry> 4712 <entry><structname>pg_ts_dict</structname></entry> 4713 <entry>text search dictionary</entry> 4714 <entry><literal>simple</literal></entry> 4715 </row> 4716 4717 <row> 4718 <entry><type>regnamespace</type></entry> 4719 <entry><structname>pg_namespace</structname></entry> 4720 <entry>namespace name</entry> 4721 <entry><literal>pg_catalog</literal></entry> 4722 </row> 4723 4724 <row> 4725 <entry><type>regoper</type></entry> 4726 <entry><structname>pg_operator</structname></entry> 4727 <entry>operator name</entry> 4728 <entry><literal>+</literal></entry> 4729 </row> 4730 4731 <row> 4732 <entry><type>regoperator</type></entry> 4733 <entry><structname>pg_operator</structname></entry> 4734 <entry>operator with argument types</entry> 4735 <entry><literal>*(integer,&zwsp;integer)</literal> 4736 or <literal>-(NONE,&zwsp;integer)</literal></entry> 4737 </row> 4738 4739 <row> 4740 <entry><type>regproc</type></entry> 4741 <entry><structname>pg_proc</structname></entry> 4742 <entry>function name</entry> 4743 <entry><literal>sum</literal></entry> 4744 </row> 4745 4746 <row> 4747 <entry><type>regprocedure</type></entry> 4748 <entry><structname>pg_proc</structname></entry> 4749 <entry>function with argument types</entry> 4750 <entry><literal>sum(int4)</literal></entry> 4751 </row> 4752 4753 <row> 4754 <entry><type>regrole</type></entry> 4755 <entry><structname>pg_authid</structname></entry> 4756 <entry>role name</entry> 4757 <entry><literal>smithee</literal></entry> 4758 </row> 4759 4760 <row> 4761 <entry><type>regtype</type></entry> 4762 <entry><structname>pg_type</structname></entry> 4763 <entry>data type name</entry> 4764 <entry><literal>integer</literal></entry> 4765 </row> 4766 </tbody> 4767 </tgroup> 4768 </table> 4769 4770 <para> 4771 All of the OID alias types for objects grouped by namespace accept 4772 schema-qualified names, and will 4773 display schema-qualified names on output if the object would not 4774 be found in the current search path without being qualified. 4775 The <type>regproc</type> and <type>regoper</type> alias types will only 4776 accept input names that are unique (not overloaded), so they are 4777 of limited use; for most uses <type>regprocedure</type> or 4778 <type>regoperator</type> are more appropriate. For <type>regoperator</type>, 4779 unary operators are identified by writing <literal>NONE</literal> for the unused 4780 operand. 4781 </para> 4782 4783 <para> 4784 An additional property of most of the OID alias types is the creation of 4785 dependencies. If a 4786 constant of one of these types appears in a stored expression 4787 (such as a column default expression or view), it creates a dependency 4788 on the referenced object. For example, if a column has a default 4789 expression <literal>nextval('my_seq'::regclass)</literal>, 4790 <productname>PostgreSQL</productname> 4791 understands that the default expression depends on the sequence 4792 <literal>my_seq</literal>; the system will not let the sequence be dropped 4793 without first removing the default expression. 4794 <type>regrole</type> is the only exception for the property. Constants of this 4795 type are not allowed in such expressions. 4796 </para> 4797 4798 <note> 4799 <para> 4800 The OID alias types do not completely follow transaction isolation 4801 rules. The planner also treats them as simple constants, which may 4802 result in sub-optimal planning. 4803 </para> 4804 </note> 4805 4806 <para> 4807 Another identifier type used by the system is <type>xid</type>, or transaction 4808 (abbreviated <abbrev>xact</abbrev>) identifier. This is the data type of the system columns 4809 <structfield>xmin</structfield> and <structfield>xmax</structfield>. Transaction identifiers are 32-bit quantities. 4810 In some contexts, a 64-bit variant <type>xid8</type> is used. Unlike 4811 <type>xid</type> values, <type>xid8</type> values increase strictly 4812 monotonically and cannot be reused in the lifetime of a database cluster. 4813 </para> 4814 4815 <para> 4816 A third identifier type used by the system is <type>cid</type>, or 4817 command identifier. This is the data type of the system columns 4818 <structfield>cmin</structfield> and <structfield>cmax</structfield>. Command identifiers are also 32-bit quantities. 4819 </para> 4820 4821 <para> 4822 A final identifier type used by the system is <type>tid</type>, or tuple 4823 identifier (row identifier). This is the data type of the system column 4824 <structfield>ctid</structfield>. A tuple ID is a pair 4825 (block number, tuple index within block) that identifies the 4826 physical location of the row within its table. 4827 </para> 4828 4829 <para> 4830 (The system columns are further explained in <xref 4831 linkend="ddl-system-columns"/>.) 4832 </para> 4833 </sect1> 4834 4835 <sect1 id="datatype-pg-lsn"> 4836 <title><acronym>pg_lsn Type</acronym></title> 4837 4838 <indexterm zone="datatype-pg-lsn"> 4839 <primary>pg_lsn</primary> 4840 </indexterm> 4841 4842 <para> 4843 The <type>pg_lsn</type> data type can be used to store LSN (Log Sequence 4844 Number) data which is a pointer to a location in the WAL. This type is a 4845 representation of <type>XLogRecPtr</type> and an internal system type of 4846 <productname>PostgreSQL</productname>. 4847 </para> 4848 4849 <para> 4850 Internally, an LSN is a 64-bit integer, representing a byte position in 4851 the write-ahead log stream. It is printed as two hexadecimal numbers of 4852 up to 8 digits each, separated by a slash; for example, 4853 <literal>16/B374D848</literal>. The <type>pg_lsn</type> type supports the 4854 standard comparison operators, like <literal>=</literal> and 4855 <literal>></literal>. Two LSNs can be subtracted using the 4856 <literal>-</literal> operator; the result is the number of bytes separating 4857 those write-ahead log locations. 4858 </para> 4859 </sect1> 4860 4861 <sect1 id="datatype-pseudo"> 4862 <title>Pseudo-Types</title> 4863 4864 <indexterm zone="datatype-pseudo"> 4865 <primary>record</primary> 4866 </indexterm> 4867 4868 <indexterm zone="datatype-pseudo"> 4869 <primary>any</primary> 4870 </indexterm> 4871 4872 <indexterm zone="datatype-pseudo"> 4873 <primary>anyelement</primary> 4874 </indexterm> 4875 4876 <indexterm zone="datatype-pseudo"> 4877 <primary>anyarray</primary> 4878 </indexterm> 4879 4880 <indexterm zone="datatype-pseudo"> 4881 <primary>anynonarray</primary> 4882 </indexterm> 4883 4884 <indexterm zone="datatype-pseudo"> 4885 <primary>anyenum</primary> 4886 </indexterm> 4887 4888 <indexterm zone="datatype-pseudo"> 4889 <primary>anyrange</primary> 4890 </indexterm> 4891 4892 <indexterm zone="datatype-pseudo"> 4893 <primary>anycompatible</primary> 4894 </indexterm> 4895 4896 <indexterm zone="datatype-pseudo"> 4897 <primary>anycompatiblearray</primary> 4898 </indexterm> 4899 4900 <indexterm zone="datatype-pseudo"> 4901 <primary>anycompatiblenonarray</primary> 4902 </indexterm> 4903 4904 <indexterm zone="datatype-pseudo"> 4905 <primary>anycompatiblerange</primary> 4906 </indexterm> 4907 4908 <indexterm zone="datatype-pseudo"> 4909 <primary>void</primary> 4910 </indexterm> 4911 4912 <indexterm zone="datatype-pseudo"> 4913 <primary>trigger</primary> 4914 </indexterm> 4915 4916 <indexterm zone="datatype-pseudo"> 4917 <primary>event_trigger</primary> 4918 </indexterm> 4919 4920 <indexterm zone="datatype-pseudo"> 4921 <primary>pg_ddl_command</primary> 4922 </indexterm> 4923 4924 <indexterm zone="datatype-pseudo"> 4925 <primary>language_handler</primary> 4926 </indexterm> 4927 4928 <indexterm zone="datatype-pseudo"> 4929 <primary>fdw_handler</primary> 4930 </indexterm> 4931 4932 <indexterm zone="datatype-pseudo"> 4933 <primary>table_am_handler</primary> 4934 </indexterm> 4935 4936 <indexterm zone="datatype-pseudo"> 4937 <primary>index_am_handler</primary> 4938 </indexterm> 4939 4940 <indexterm zone="datatype-pseudo"> 4941 <primary>tsm_handler</primary> 4942 </indexterm> 4943 4944 <indexterm zone="datatype-pseudo"> 4945 <primary>cstring</primary> 4946 </indexterm> 4947 4948 <indexterm zone="datatype-pseudo"> 4949 <primary>internal</primary> 4950 </indexterm> 4951 4952 <indexterm zone="datatype-pseudo"> 4953 <primary>unknown</primary> 4954 </indexterm> 4955 4956 <para> 4957 The <productname>PostgreSQL</productname> type system contains a 4958 number of special-purpose entries that are collectively called 4959 <firstterm>pseudo-types</firstterm>. A pseudo-type cannot be used as a 4960 column data type, but it can be used to declare a function's 4961 argument or result type. Each of the available pseudo-types is 4962 useful in situations where a function's behavior does not 4963 correspond to simply taking or returning a value of a specific 4964 <acronym>SQL</acronym> data type. <xref 4965 linkend="datatype-pseudotypes-table"/> lists the existing 4966 pseudo-types. 4967 </para> 4968 4969 <table id="datatype-pseudotypes-table"> 4970 <title>Pseudo-Types</title> 4971 <tgroup cols="2"> 4972 <colspec colname="col1" colwidth="1*"/> 4973 <colspec colname="col2" colwidth="2*"/> 4974 <thead> 4975 <row> 4976 <entry>Name</entry> 4977 <entry>Description</entry> 4978 </row> 4979 </thead> 4980 4981 <tbody> 4982 <row> 4983 <entry><type>any</type></entry> 4984 <entry>Indicates that a function accepts any input data type.</entry> 4985 </row> 4986 4987 <row> 4988 <entry><type>anyelement</type></entry> 4989 <entry>Indicates that a function accepts any data type 4990 (see <xref linkend="extend-types-polymorphic"/>).</entry> 4991 </row> 4992 4993 <row> 4994 <entry><type>anyarray</type></entry> 4995 <entry>Indicates that a function accepts any array data type 4996 (see <xref linkend="extend-types-polymorphic"/>).</entry> 4997 </row> 4998 4999 <row> 5000 <entry><type>anynonarray</type></entry> 5001 <entry>Indicates that a function accepts any non-array data type 5002 (see <xref linkend="extend-types-polymorphic"/>).</entry> 5003 </row> 5004 5005 <row> 5006 <entry><type>anyenum</type></entry> 5007 <entry>Indicates that a function accepts any enum data type 5008 (see <xref linkend="extend-types-polymorphic"/> and 5009 <xref linkend="datatype-enum"/>).</entry> 5010 </row> 5011 5012 <row> 5013 <entry><type>anyrange</type></entry> 5014 <entry>Indicates that a function accepts any range data type 5015 (see <xref linkend="extend-types-polymorphic"/> and 5016 <xref linkend="rangetypes"/>).</entry> 5017 </row> 5018 5019 <row> 5020 <entry><type>anycompatible</type></entry> 5021 <entry>Indicates that a function accepts any data type, 5022 with automatic promotion of multiple arguments to a common data type 5023 (see <xref linkend="extend-types-polymorphic"/>).</entry> 5024 </row> 5025 5026 <row> 5027 <entry><type>anycompatiblearray</type></entry> 5028 <entry>Indicates that a function accepts any array data type, 5029 with automatic promotion of multiple arguments to a common data type 5030 (see <xref linkend="extend-types-polymorphic"/>).</entry> 5031 </row> 5032 5033 <row> 5034 <entry><type>anycompatiblenonarray</type></entry> 5035 <entry>Indicates that a function accepts any non-array data type, 5036 with automatic promotion of multiple arguments to a common data type 5037 (see <xref linkend="extend-types-polymorphic"/>).</entry> 5038 </row> 5039 5040 <row> 5041 <entry><type>anycompatiblerange</type></entry> 5042 <entry>Indicates that a function accepts any range data type, 5043 with automatic promotion of multiple arguments to a common data type 5044 (see <xref linkend="extend-types-polymorphic"/> and 5045 <xref linkend="rangetypes"/>).</entry> 5046 </row> 5047 5048 <row> 5049 <entry><type>cstring</type></entry> 5050 <entry>Indicates that a function accepts or returns a null-terminated C string.</entry> 5051 </row> 5052 5053 <row> 5054 <entry><type>internal</type></entry> 5055 <entry>Indicates that a function accepts or returns a server-internal 5056 data type.</entry> 5057 </row> 5058 5059 <row> 5060 <entry><type>language_handler</type></entry> 5061 <entry>A procedural language call handler is declared to return <type>language_handler</type>.</entry> 5062 </row> 5063 5064 <row> 5065 <entry><type>fdw_handler</type></entry> 5066 <entry>A foreign-data wrapper handler is declared to return <type>fdw_handler</type>.</entry> 5067 </row> 5068 5069 <row> 5070 <entry><type>table_am_handler</type></entry> 5071 <entry>A table access method handler is declared to return <type>table_am_handler</type>.</entry> 5072 </row> 5073 5074 <row> 5075 <entry><type>index_am_handler</type></entry> 5076 <entry>An index access method handler is declared to return <type>index_am_handler</type>.</entry> 5077 </row> 5078 5079 <row> 5080 <entry><type>tsm_handler</type></entry> 5081 <entry>A tablesample method handler is declared to return <type>tsm_handler</type>.</entry> 5082 </row> 5083 5084 <row> 5085 <entry><type>record</type></entry> 5086 <entry>Identifies a function taking or returning an unspecified row type.</entry> 5087 </row> 5088 5089 <row> 5090 <entry><type>trigger</type></entry> 5091 <entry>A trigger function is declared to return <type>trigger.</type></entry> 5092 </row> 5093 5094 <row> 5095 <entry><type>event_trigger</type></entry> 5096 <entry>An event trigger function is declared to return <type>event_trigger.</type></entry> 5097 </row> 5098 5099 <row> 5100 <entry><type>pg_ddl_command</type></entry> 5101 <entry>Identifies a representation of DDL commands that is available to event triggers.</entry> 5102 </row> 5103 5104 <row> 5105 <entry><type>void</type></entry> 5106 <entry>Indicates that a function returns no value.</entry> 5107 </row> 5108 5109 <row> 5110 <entry><type>unknown</type></entry> 5111 <entry>Identifies a not-yet-resolved type, e.g., of an undecorated 5112 string literal.</entry> 5113 </row> 5114 </tbody> 5115 </tgroup> 5116 </table> 5117 5118 <para> 5119 Functions coded in C (whether built-in or dynamically loaded) can be 5120 declared to accept or return any of these pseudo-types. It is up to 5121 the function author to ensure that the function will behave safely 5122 when a pseudo-type is used as an argument type. 5123 </para> 5124 5125 <para> 5126 Functions coded in procedural languages can use pseudo-types only as 5127 allowed by their implementation languages. At present most procedural 5128 languages forbid use of a pseudo-type as an argument type, and allow 5129 only <type>void</type> and <type>record</type> as a result type (plus 5130 <type>trigger</type> or <type>event_trigger</type> when the function is used 5131 as a trigger or event trigger). Some also support polymorphic functions 5132 using the polymorphic pseudo-types, which are shown above and discussed 5133 in detail in <xref linkend="extend-types-polymorphic"/>. 5134 </para> 5135 5136 <para> 5137 The <type>internal</type> pseudo-type is used to declare functions 5138 that are meant only to be called internally by the database 5139 system, and not by direct invocation in an <acronym>SQL</acronym> 5140 query. If a function has at least one <type>internal</type>-type 5141 argument then it cannot be called from <acronym>SQL</acronym>. To 5142 preserve the type safety of this restriction it is important to 5143 follow this coding rule: do not create any function that is 5144 declared to return <type>internal</type> unless it has at least one 5145 <type>internal</type> argument. 5146 </para> 5147 5148 </sect1> 5149 5150 </chapter> 5151