1<!-- doc/src/sgml/rangetypes.sgml -->
2
3<sect1 id="rangetypes">
4 <title>Range Types</title>
5
6 <indexterm>
7  <primary>range type</primary>
8 </indexterm>
9
10 <para>
11  Range types are data types representing a range of values of some
12  element type (called the range's <firstterm>subtype</firstterm>).
13  For instance, ranges
14  of <type>timestamp</type> might be used to represent the ranges of
15  time that a meeting room is reserved. In this case the data type
16  is <type>tsrange</type> (short for <quote>timestamp range</quote>),
17  and <type>timestamp</type> is the subtype.  The subtype must have
18  a total order so that it is well-defined whether element values are
19  within, before, or after a range of values.
20 </para>
21
22 <para>
23  Range types are useful because they represent many element values in a
24  single range value, and because concepts such as overlapping ranges can
25  be expressed clearly. The use of time and date ranges for scheduling
26  purposes is the clearest example; but price ranges, measurement
27  ranges from an instrument, and so forth can also be useful.
28 </para>
29
30 <sect2 id="rangetypes-builtin">
31  <title>Built-in Range Types</title>
32
33 <para>
34  PostgreSQL comes with the following built-in range types:
35  <itemizedlist>
36    <listitem>
37      <para>
38       <type>int4range</type> &mdash; Range of <type>integer</type>
39      </para>
40    </listitem>
41    <listitem>
42      <para>
43       <type>int8range</type> &mdash; Range of <type>bigint</type>
44      </para>
45    </listitem>
46    <listitem>
47      <para>
48       <type>numrange</type> &mdash; Range of <type>numeric</type>
49      </para>
50    </listitem>
51    <listitem>
52      <para>
53       <type>tsrange</type> &mdash; Range of <type>timestamp without time zone</type>
54      </para>
55    </listitem>
56    <listitem>
57      <para>
58       <type>tstzrange</type> &mdash; Range of <type>timestamp with time zone</type>
59      </para>
60    </listitem>
61    <listitem>
62      <para>
63       <type>daterange</type> &mdash; Range of <type>date</type>
64      </para>
65    </listitem>
66  </itemizedlist>
67  In addition, you can define your own range types;
68  see <xref linkend="sql-createtype"/> for more information.
69 </para>
70 </sect2>
71
72 <sect2 id="rangetypes-examples">
73  <title>Examples</title>
74
75  <para>
76<programlisting>
77CREATE TABLE reservation (room int, during tsrange);
78INSERT INTO reservation VALUES
79    (1108, '[2010-01-01 14:30, 2010-01-01 15:30)');
80
81-- Containment
82SELECT int4range(10, 20) @&gt; 3;
83
84-- Overlaps
85SELECT numrange(11.1, 22.2) &amp;&amp; numrange(20.0, 30.0);
86
87-- Extract the upper bound
88SELECT upper(int8range(15, 25));
89
90-- Compute the intersection
91SELECT int4range(10, 20) * int4range(15, 25);
92
93-- Is the range empty?
94SELECT isempty(numrange(1, 5));
95</programlisting>
96
97   See <xref linkend="range-operators-table"/>
98   and <xref linkend="range-functions-table"/> for complete lists of
99   operators and functions on range types.
100  </para>
101 </sect2>
102
103 <sect2 id="rangetypes-inclusivity">
104  <title>Inclusive and Exclusive Bounds</title>
105
106  <para>
107   Every non-empty range has two bounds, the lower bound and the upper
108   bound. All points between these values are included in the range. An
109   inclusive bound means that the boundary point itself is included in
110   the range as well, while an exclusive bound means that the boundary
111   point is not included in the range.
112  </para>
113
114  <para>
115   In the text form of a range, an inclusive lower bound is represented by
116   <quote><literal>[</literal></quote> while an exclusive lower bound is
117   represented by <quote><literal>(</literal></quote>. Likewise, an inclusive upper bound is represented by
118   <quote><literal>]</literal></quote>, while an exclusive upper bound is
119   represented by <quote><literal>)</literal></quote>.
120   (See <xref linkend="rangetypes-io"/> for more details.)
121  </para>
122
123  <para>
124   The functions <literal>lower_inc</literal>
125   and <literal>upper_inc</literal> test the inclusivity of the lower
126   and upper bounds of a range value, respectively.
127  </para>
128 </sect2>
129
130 <sect2 id="rangetypes-infinite">
131  <title>Infinite (Unbounded) Ranges</title>
132
133  <para>
134   The lower bound of a range can be omitted, meaning that all
135   values less than the upper bound are included in the range, e.g.,
136   <literal>(,3]</literal>. Likewise, if the upper bound of the range
137   is omitted, then all values greater than the lower bound are included
138   in the range. If both lower and upper bounds are omitted, all values
139   of the element type are considered to be in the range.  Specifying a
140   missing bound as inclusive is automatically converted to exclusive,
141   e.g., <literal>[,]</literal> is converted to <literal>(,)</literal>.
142   You can think of these missing values as +/-infinity, but they are
143   special range type values and are considered to be beyond any range
144   element type's +/-infinity values.
145  </para>
146
147  <para>
148   Element types that have the notion of <quote>infinity</quote> can
149   use them as explicit bound values.  For example, with timestamp
150   ranges, <literal>[today,infinity)</literal> excludes the special
151   <type>timestamp</type> value <literal>infinity</literal>,
152   while <literal>[today,infinity]</literal> include it, as does
153   <literal>[today,)</literal> and <literal>[today,]</literal>.
154  </para>
155
156  <para>
157   The functions <literal>lower_inf</literal>
158   and <literal>upper_inf</literal> test for infinite lower
159   and upper bounds of a range, respectively.
160  </para>
161 </sect2>
162
163 <sect2 id="rangetypes-io">
164  <title>Range Input/Output</title>
165
166  <para>
167   The input for a range value must follow one of the following patterns:
168<synopsis>
169(<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>)
170(<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>]
171[<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>)
172[<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>]
173empty
174</synopsis>
175   The parentheses or brackets indicate whether the lower and upper bounds
176   are exclusive or inclusive, as described previously.
177   Notice that the final pattern is <literal>empty</literal>, which
178   represents an empty range (a range that contains no points).
179  </para>
180
181  <para>
182   The <replaceable>lower-bound</replaceable> may be either a string
183   that is valid input for the subtype, or empty to indicate no
184   lower bound.  Likewise, <replaceable>upper-bound</replaceable> may be
185   either a string that is valid input for the subtype, or empty to
186   indicate no upper bound.
187  </para>
188
189  <para>
190   Each bound value can be quoted using <literal>"</literal> (double quote)
191   characters.  This is necessary if the bound value contains parentheses,
192   brackets, commas, double quotes, or backslashes, since these characters
193   would otherwise be taken as part of the range syntax.  To put a double
194   quote or backslash in a quoted bound value, precede it with a
195   backslash. (Also, a pair of double quotes within a double-quoted bound
196   value is taken to represent a double quote character, analogously to the
197   rules for single quotes in SQL literal strings.) Alternatively, you can
198   avoid quoting and use backslash-escaping to protect all data characters
199   that would otherwise be taken as range syntax.  Also, to write a bound
200   value that is an empty string, write <literal>""</literal>, since writing
201   nothing means an infinite bound.
202  </para>
203
204  <para>
205   Whitespace is allowed before and after the range value, but any whitespace
206   between the parentheses or brackets is taken as part of the lower or upper
207   bound value.  (Depending on the element type, it might or might not be
208   significant.)
209  </para>
210
211  <note>
212   <para>
213    These rules are very similar to those for writing field values in
214    composite-type literals.  See <xref linkend="rowtypes-io-syntax"/> for
215    additional commentary.
216   </para>
217  </note>
218
219  <para>
220  Examples:
221<programlisting>
222-- includes 3, does not include 7, and does include all points in between
223SELECT '[3,7)'::int4range;
224
225-- does not include either 3 or 7, but includes all points in between
226SELECT '(3,7)'::int4range;
227
228-- includes only the single point 4
229SELECT '[4,4]'::int4range;
230
231-- includes no points (and will be normalized to 'empty')
232SELECT '[4,4)'::int4range;
233</programlisting>
234  </para>
235 </sect2>
236
237 <sect2 id="rangetypes-construct">
238  <title>Constructing Ranges</title>
239
240  <para>
241   Each range type has a constructor function with the same name as the range
242   type.  Using the constructor function is frequently more convenient than
243   writing a range literal constant, since it avoids the need for extra
244   quoting of the bound values.  The constructor function
245   accepts two or three arguments.  The two-argument form constructs a range
246   in standard form (lower bound inclusive, upper bound exclusive), while
247   the three-argument form constructs a range with bounds of the form
248   specified by the third argument.
249   The third argument must be one of the strings
250   <quote><literal>()</literal></quote>,
251   <quote><literal>(]</literal></quote>,
252   <quote><literal>[)</literal></quote>, or
253   <quote><literal>[]</literal></quote>.
254   For example:
255
256<programlisting>
257-- The full form is: lower bound, upper bound, and text argument indicating
258-- inclusivity/exclusivity of bounds.
259SELECT numrange(1.0, 14.0, '(]');
260
261-- If the third argument is omitted, '[)' is assumed.
262SELECT numrange(1.0, 14.0);
263
264-- Although '(]' is specified here, on display the value will be converted to
265-- canonical form, since int8range is a discrete range type (see below).
266SELECT int8range(1, 14, '(]');
267
268-- Using NULL for either bound causes the range to be unbounded on that side.
269SELECT numrange(NULL, 2.2);
270</programlisting>
271  </para>
272 </sect2>
273
274 <sect2 id="rangetypes-discrete">
275  <title>Discrete Range Types</title>
276
277  <para>
278   A discrete range is one whose element type has a well-defined
279   <quote>step</quote>, such as <type>integer</type> or <type>date</type>.
280   In these types two elements can be said to be adjacent, when there are
281   no valid values between them.  This contrasts with continuous ranges,
282   where it's always (or almost always) possible to identify other element
283   values between two given values.  For example, a range over the
284   <type>numeric</type> type is continuous, as is a range over <type>timestamp</type>.
285   (Even though <type>timestamp</type> has limited precision, and so could
286   theoretically be treated as discrete, it's better to consider it continuous
287   since the step size is normally not of interest.)
288  </para>
289
290  <para>
291   Another way to think about a discrete range type is that there is a clear
292   idea of a <quote>next</quote> or <quote>previous</quote> value for each element value.
293   Knowing that, it is possible to convert between inclusive and exclusive
294   representations of a range's bounds, by choosing the next or previous
295   element value instead of the one originally given.
296   For example, in an integer range type <literal>[4,8]</literal> and
297   <literal>(3,9)</literal> denote the same set of values; but this would not be so
298   for a range over numeric.
299  </para>
300
301  <para>
302   A discrete range type should have a <firstterm>canonicalization</firstterm>
303   function that is aware of the desired step size for the element type.
304   The canonicalization function is charged with converting equivalent values
305   of the range type to have identical representations, in particular
306   consistently inclusive or exclusive bounds.
307   If a canonicalization function is not specified, then ranges with different
308   formatting will always be treated as unequal, even though they might
309   represent the same set of values in reality.
310  </para>
311
312  <para>
313   The built-in range types <type>int4range</type>, <type>int8range</type>,
314   and <type>daterange</type> all use a canonical form that includes
315   the lower bound and excludes the upper bound; that is,
316   <literal>[)</literal>. User-defined range types can use other conventions,
317   however.
318  </para>
319 </sect2>
320
321 <sect2 id="rangetypes-defining">
322  <title>Defining New Range Types</title>
323
324  <para>
325   Users can define their own range types. The most common reason to do
326   this is to use ranges over subtypes not provided among the built-in
327   range types.
328   For example, to define a new range type of subtype <type>float8</type>:
329
330<programlisting>
331CREATE TYPE floatrange AS RANGE (
332    subtype = float8,
333    subtype_diff = float8mi
334);
335
336SELECT '[1.234, 5.678]'::floatrange;
337</programlisting>
338
339   Because <type>float8</type> has no meaningful
340   <quote>step</quote>, we do not define a canonicalization
341   function in this example.
342  </para>
343
344  <para>
345   Defining your own range type also allows you to specify a different
346   subtype B-tree operator class or collation to use, so as to change the sort
347   ordering that determines which values fall into a given range.
348  </para>
349
350  <para>
351   If the subtype is considered to have discrete rather than continuous
352   values, the <command>CREATE TYPE</command> command should specify a
353   <literal>canonical</literal> function.
354   The canonicalization function takes an input range value, and must return
355   an equivalent range value that may have different bounds and formatting.
356   The canonical output for two ranges that represent the same set of values,
357   for example the integer ranges <literal>[1, 7]</literal> and <literal>[1,
358   8)</literal>, must be identical.  It doesn't matter which representation
359   you choose to be the canonical one, so long as two equivalent values with
360   different formattings are always mapped to the same value with the same
361   formatting.  In addition to adjusting the inclusive/exclusive bounds
362   format, a canonicalization function might round off boundary values, in
363   case the desired step size is larger than what the subtype is capable of
364   storing.  For instance, a range type over <type>timestamp</type> could be
365   defined to have a step size of an hour, in which case the canonicalization
366   function would need to round off bounds that weren't a multiple of an hour,
367   or perhaps throw an error instead.
368  </para>
369
370  <para>
371   In addition, any range type that is meant to be used with GiST or SP-GiST
372   indexes should define a subtype difference, or <literal>subtype_diff</literal>,
373   function.  (The index will still work without <literal>subtype_diff</literal>,
374   but it is likely to be considerably less efficient than if a difference
375   function is provided.)  The subtype difference function takes two input
376   values of the subtype, and returns their difference
377   (i.e., <replaceable>X</replaceable> minus <replaceable>Y</replaceable>) represented as
378   a <type>float8</type> value.  In our example above, the
379   function <function>float8mi</function> that underlies the regular <type>float8</type>
380   minus operator can be used; but for any other subtype, some type
381   conversion would be necessary.  Some creative thought about how to
382   represent differences as numbers might be needed, too.  To the greatest
383   extent possible, the <literal>subtype_diff</literal> function should agree with
384   the sort ordering implied by the selected operator class and collation;
385   that is, its result should be positive whenever its first argument is
386   greater than its second according to the sort ordering.
387  </para>
388
389  <para>
390   A less-oversimplified example of a <literal>subtype_diff</literal> function is:
391  </para>
392
393<programlisting>
394CREATE FUNCTION time_subtype_diff(x time, y time) RETURNS float8 AS
395'SELECT EXTRACT(EPOCH FROM (x - y))' LANGUAGE sql STRICT IMMUTABLE;
396
397CREATE TYPE timerange AS RANGE (
398    subtype = time,
399    subtype_diff = time_subtype_diff
400);
401
402SELECT '[11:10, 23:00]'::timerange;
403</programlisting>
404
405  <para>
406   See <xref linkend="sql-createtype"/> for more information about creating
407   range types.
408  </para>
409 </sect2>
410
411 <sect2 id="rangetypes-indexing">
412  <title>Indexing</title>
413
414  <indexterm>
415    <primary>range type</primary>
416    <secondary>indexes on</secondary>
417  </indexterm>
418
419  <para>
420   GiST and SP-GiST indexes can be created for table columns of range types.
421   For instance, to create a GiST index:
422<programlisting>
423CREATE INDEX reservation_idx ON reservation USING GIST (during);
424</programlisting>
425   A GiST or SP-GiST index can accelerate queries involving these range operators:
426   <literal>=</literal>,
427   <literal>&amp;&amp;</literal>,
428   <literal>&lt;@</literal>,
429   <literal>@&gt;</literal>,
430   <literal>&lt;&lt;</literal>,
431   <literal>&gt;&gt;</literal>,
432   <literal>-|-</literal>,
433   <literal>&amp;&lt;</literal>, and
434   <literal>&amp;&gt;</literal>
435   (see <xref linkend="range-operators-table"/> for more information).
436  </para>
437
438  <para>
439   In addition, B-tree and hash indexes can be created for table columns of
440   range types.  For these index types, basically the only useful range
441   operation is equality.  There is a B-tree sort ordering defined for range
442   values, with corresponding <literal>&lt;</literal> and <literal>&gt;</literal> operators,
443   but the ordering is rather arbitrary and not usually useful in the real
444   world.  Range types' B-tree and hash support is primarily meant to
445   allow sorting and hashing internally in queries, rather than creation of
446   actual indexes.
447  </para>
448 </sect2>
449
450 <sect2 id="rangetypes-constraint">
451  <title>Constraints on Ranges</title>
452
453  <indexterm>
454    <primary>range type</primary>
455    <secondary>exclude</secondary>
456  </indexterm>
457
458  <para>
459   While <literal>UNIQUE</literal> is a natural constraint for scalar
460   values, it is usually unsuitable for range types. Instead, an
461   exclusion constraint is often more appropriate
462   (see <link linkend="sql-createtable-exclude">CREATE TABLE
463   ... CONSTRAINT ... EXCLUDE</link>). Exclusion constraints allow the
464   specification of constraints such as <quote>non-overlapping</quote> on a
465   range type. For example:
466
467<programlisting>
468CREATE TABLE reservation (
469    during tsrange,
470    EXCLUDE USING GIST (during WITH &amp;&amp;)
471);
472</programlisting>
473
474   That constraint will prevent any overlapping values from existing
475   in the table at the same time:
476
477<programlisting>
478INSERT INTO reservation VALUES
479    ('[2010-01-01 11:30, 2010-01-01 15:00)');
480INSERT 0 1
481
482INSERT INTO reservation VALUES
483    ('[2010-01-01 14:45, 2010-01-01 15:45)');
484ERROR:  conflicting key value violates exclusion constraint "reservation_during_excl"
485DETAIL:  Key (during)=(["2010-01-01 14:45:00","2010-01-01 15:45:00")) conflicts
486with existing key (during)=(["2010-01-01 11:30:00","2010-01-01 15:00:00")).
487</programlisting>
488  </para>
489
490  <para>
491   You can use the <link linkend="btree-gist"><literal>btree_gist</literal></link>
492   extension to define exclusion constraints on plain scalar data types, which
493   can then be combined with range exclusions for maximum flexibility.  For
494   example, after <literal>btree_gist</literal> is installed, the following
495   constraint will reject overlapping ranges only if the meeting room numbers
496   are equal:
497
498<programlisting>
499CREATE EXTENSION btree_gist;
500CREATE TABLE room_reservation (
501    room text,
502    during tsrange,
503    EXCLUDE USING GIST (room WITH =, during WITH &amp;&amp;)
504);
505
506INSERT INTO room_reservation VALUES
507    ('123A', '[2010-01-01 14:00, 2010-01-01 15:00)');
508INSERT 0 1
509
510INSERT INTO room_reservation VALUES
511    ('123A', '[2010-01-01 14:30, 2010-01-01 15:30)');
512ERROR:  conflicting key value violates exclusion constraint "room_reservation_room_during_excl"
513DETAIL:  Key (room, during)=(123A, ["2010-01-01 14:30:00","2010-01-01 15:30:00")) conflicts
514with existing key (room, during)=(123A, ["2010-01-01 14:00:00","2010-01-01 15:00:00")).
515
516INSERT INTO room_reservation VALUES
517    ('123B', '[2010-01-01 14:30, 2010-01-01 15:30)');
518INSERT 0 1
519</programlisting>
520  </para>
521 </sect2>
522</sect1>
523