1<!-- doc/src/sgml/json.sgml -->
2
3<sect1 id="datatype-json">
4 <title><acronym>JSON</acronym> Types</title>
5
6 <indexterm zone="datatype-json">
7  <primary>JSON</primary>
8 </indexterm>
9
10 <indexterm zone="datatype-json">
11  <primary>JSONB</primary>
12 </indexterm>
13
14 <para>
15  JSON data types are for storing JSON (JavaScript Object Notation)
16  data, as specified in <ulink url="https://tools.ietf.org/html/rfc7159">RFC
17  7159</ulink>. Such data can also be stored as <type>text</type>, but
18  the JSON data types have the advantage of enforcing that each
19  stored value is valid according to the JSON rules.  There are also
20  assorted JSON-specific functions and operators available for data stored
21  in these data types; see <xref linkend="functions-json"/>.
22 </para>
23
24 <para>
25  <productname>PostgreSQL</productname> offers two types for storing JSON
26  data: <type>json</type> and <type>jsonb</type>. To implement efficient query
27  mechanisms for these data types, <productname>PostgreSQL</productname>
28  also provides the <type>jsonpath</type> data type described in
29  <xref linkend="datatype-jsonpath"/>.
30 </para>
31
32 <para>
33  The <type>json</type> and <type>jsonb</type> data types
34  accept <emphasis>almost</emphasis> identical sets of values as
35  input.  The major practical difference is one of efficiency.  The
36  <type>json</type> data type stores an exact copy of the input text,
37  which processing functions must reparse on each execution; while
38  <type>jsonb</type> data is stored in a decomposed binary format that
39  makes it slightly slower to input due to added conversion
40  overhead, but significantly faster to process, since no reparsing
41  is needed.  <type>jsonb</type> also supports indexing, which can be a
42  significant advantage.
43 </para>
44
45 <para>
46  Because the <type>json</type> type stores an exact copy of the input text, it
47  will preserve semantically-insignificant white space between tokens, as
48  well as the order of keys within JSON objects. Also, if a JSON object
49  within the value contains the same key more than once, all the key/value
50  pairs are kept.  (The processing functions consider the last value as the
51  operative one.)  By contrast, <type>jsonb</type> does not preserve white
52  space, does not preserve the order of object keys, and does not keep
53  duplicate object keys.  If duplicate keys are specified in the input,
54  only the last value is kept.
55 </para>
56
57 <para>
58  In general, most applications should prefer to store JSON data as
59  <type>jsonb</type>, unless there are quite specialized needs, such as
60  legacy assumptions about ordering of object keys.
61 </para>
62
63 <para>
64  RFC 7159 specifies that JSON strings should be encoded in UTF8.
65  It is therefore not possible for the JSON
66  types to conform rigidly to the JSON specification unless the database
67  encoding is UTF8. Attempts to directly include characters that
68  cannot be represented in the database encoding will fail; conversely,
69  characters that can be represented in the database encoding but not
70  in UTF8 will be allowed.
71 </para>
72
73 <para>
74  RFC 7159 permits JSON strings to contain Unicode escape sequences
75  denoted by <literal>\u<replaceable>XXXX</replaceable></literal>.  In the input
76  function for the <type>json</type> type, Unicode escapes are allowed
77  regardless of the database encoding, and are checked only for syntactic
78  correctness (that is, that four hex digits follow <literal>\u</literal>).
79  However, the input function for <type>jsonb</type> is stricter: it disallows
80  Unicode escapes for characters that cannot be represented in the database
81  encoding.  The <type>jsonb</type> type also
82  rejects <literal>\u0000</literal> (because that cannot be represented in
83  <productname>PostgreSQL</productname>'s <type>text</type> type), and it insists
84  that any use of Unicode surrogate pairs to designate characters outside
85  the Unicode Basic Multilingual Plane be correct.  Valid Unicode escapes
86  are converted to the equivalent single character for storage;
87  this includes folding surrogate pairs into a single character.
88 </para>
89
90 <note>
91  <para>
92   Many of the JSON processing functions described
93   in <xref linkend="functions-json"/> will convert Unicode escapes to
94   regular characters, and will therefore throw the same types of errors
95   just described even if their input is of type <type>json</type>
96   not <type>jsonb</type>. The fact that the <type>json</type> input function does
97   not make these checks may be considered a historical artifact, although
98   it does allow for simple storage (without processing) of JSON Unicode
99   escapes in a database encoding that does not support the represented
100   characters.
101  </para>
102 </note>
103
104 <para>
105  When converting textual JSON input into <type>jsonb</type>, the primitive
106  types described by <acronym>RFC</acronym> 7159 are effectively mapped onto
107  native <productname>PostgreSQL</productname> types, as shown
108  in <xref linkend="json-type-mapping-table"/>.
109  Therefore, there are some minor additional constraints on what
110  constitutes valid <type>jsonb</type> data that do not apply to
111  the <type>json</type> type, nor to JSON in the abstract, corresponding
112  to limits on what can be represented by the underlying data type.
113  Notably, <type>jsonb</type> will reject numbers that are outside the
114  range of the <productname>PostgreSQL</productname> <type>numeric</type> data
115  type, while <type>json</type> will not.  Such implementation-defined
116  restrictions are permitted by <acronym>RFC</acronym> 7159.  However, in
117  practice such problems are far more likely to occur in other
118  implementations, as it is common to represent JSON's <type>number</type>
119  primitive type as IEEE 754 double precision floating point
120  (which <acronym>RFC</acronym> 7159 explicitly anticipates and allows for).
121  When using JSON as an interchange format with such systems, the danger
122  of losing numeric precision compared to data originally stored
123  by <productname>PostgreSQL</productname> should be considered.
124 </para>
125
126 <para>
127  Conversely, as noted in the table there are some minor restrictions on
128  the input format of JSON primitive types that do not apply to
129  the corresponding <productname>PostgreSQL</productname> types.
130 </para>
131
132  <table id="json-type-mapping-table">
133     <title>JSON Primitive Types and Corresponding <productname>PostgreSQL</productname> Types</title>
134     <tgroup cols="3">
135      <colspec colname="col1" colwidth="1*"/>
136      <colspec colname="col2" colwidth="1*"/>
137      <colspec colname="col3" colwidth="2*"/>
138      <thead>
139       <row>
140        <entry>JSON primitive type</entry>
141        <entry><productname>PostgreSQL</productname> type</entry>
142        <entry>Notes</entry>
143       </row>
144      </thead>
145      <tbody>
146       <row>
147        <entry><type>string</type></entry>
148        <entry><type>text</type></entry>
149        <entry><literal>\u0000</literal> is disallowed, as are Unicode escapes
150         representing characters not available in the database encoding</entry>
151       </row>
152       <row>
153        <entry><type>number</type></entry>
154        <entry><type>numeric</type></entry>
155        <entry><literal>NaN</literal> and <literal>infinity</literal> values are disallowed</entry>
156       </row>
157       <row>
158        <entry><type>boolean</type></entry>
159        <entry><type>boolean</type></entry>
160        <entry>Only lowercase <literal>true</literal> and <literal>false</literal> spellings are accepted</entry>
161       </row>
162       <row>
163        <entry><type>null</type></entry>
164        <entry>(none)</entry>
165        <entry>SQL <literal>NULL</literal> is a different concept</entry>
166       </row>
167      </tbody>
168     </tgroup>
169   </table>
170
171 <sect2 id="json-keys-elements">
172  <title>JSON Input and Output Syntax</title>
173  <para>
174   The input/output syntax for the JSON data types is as specified in
175   <acronym>RFC</acronym> 7159.
176  </para>
177  <para>
178   The following are all valid <type>json</type> (or <type>jsonb</type>) expressions:
179<programlisting>
180-- Simple scalar/primitive value
181-- Primitive values can be numbers, quoted strings, true, false, or null
182SELECT '5'::json;
183
184-- Array of zero or more elements (elements need not be of same type)
185SELECT '[1, 2, "foo", null]'::json;
186
187-- Object containing pairs of keys and values
188-- Note that object keys must always be quoted strings
189SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json;
190
191-- Arrays and objects can be nested arbitrarily
192SELECT '{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}'::json;
193</programlisting>
194  </para>
195
196  <para>
197   As previously stated, when a JSON value is input and then printed without
198   any additional processing, <type>json</type> outputs the same text that was
199   input, while <type>jsonb</type> does not preserve semantically-insignificant
200   details such as whitespace.  For example, note the differences here:
201<programlisting>
202SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::json;
203                      json
204-------------------------------------------------
205 {"bar": "baz", "balance": 7.77, "active":false}
206(1 row)
207
208SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::jsonb;
209                      jsonb
210--------------------------------------------------
211 {"bar": "baz", "active": false, "balance": 7.77}
212(1 row)
213</programlisting>
214   One semantically-insignificant detail worth noting is that
215   in <type>jsonb</type>, numbers will be printed according to the behavior of the
216   underlying <type>numeric</type> type.  In practice this means that numbers
217   entered with <literal>E</literal> notation will be printed without it, for
218   example:
219<programlisting>
220SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb;
221         json          |          jsonb
222-----------------------+-------------------------
223 {"reading": 1.230e-5} | {"reading": 0.00001230}
224(1 row)
225</programlisting>
226   However, <type>jsonb</type> will preserve trailing fractional zeroes, as seen
227   in this example, even though those are semantically insignificant for
228   purposes such as equality checks.
229  </para>
230
231  <para>
232    For the list of built-in functions and operators available for
233    constructing and processing JSON values, see <xref linkend="functions-json"/>.
234  </para>
235 </sect2>
236
237 <sect2 id="json-doc-design">
238  <title>Designing JSON Documents</title>
239  <para>
240   Representing data as JSON can be considerably more flexible than
241   the traditional relational data model, which is compelling in
242   environments where requirements are fluid.  It is quite possible
243   for both approaches to co-exist and complement each other within
244   the same application.  However, even for applications where maximal
245   flexibility is desired, it is still recommended that JSON documents
246   have a somewhat fixed structure.  The structure is typically
247   unenforced (though enforcing some business rules declaratively is
248   possible), but having a predictable structure makes it easier to write
249   queries that usefully summarize a set of <quote>documents</quote> (datums)
250   in a table.
251  </para>
252  <para>
253   JSON data is subject to the same concurrency-control
254   considerations as any other data type when stored in a table.
255   Although storing large documents is practicable, keep in mind that
256   any update acquires a row-level lock on the whole row.
257   Consider limiting JSON documents to a
258   manageable size in order to decrease lock contention among updating
259   transactions.  Ideally, JSON documents should each
260   represent an atomic datum that business rules dictate cannot
261   reasonably be further subdivided into smaller datums that
262   could be modified independently.
263  </para>
264 </sect2>
265
266 <sect2 id="json-containment">
267  <title><type>jsonb</type> Containment and Existence</title>
268  <indexterm>
269    <primary>jsonb</primary>
270    <secondary>containment</secondary>
271  </indexterm>
272  <indexterm>
273    <primary>jsonb</primary>
274    <secondary>existence</secondary>
275  </indexterm>
276  <para>
277    Testing <firstterm>containment</firstterm> is an important capability of
278    <type>jsonb</type>.  There is no parallel set of facilities for the
279    <type>json</type> type.  Containment tests whether
280    one <type>jsonb</type> document has contained within it another one.
281    These examples return true except as noted:
282  </para>
283<programlisting>
284-- Simple scalar/primitive values contain only the identical value:
285SELECT '"foo"'::jsonb @&gt; '"foo"'::jsonb;
286
287-- The array on the right side is contained within the one on the left:
288SELECT '[1, 2, 3]'::jsonb @&gt; '[1, 3]'::jsonb;
289
290-- Order of array elements is not significant, so this is also true:
291SELECT '[1, 2, 3]'::jsonb @&gt; '[3, 1]'::jsonb;
292
293-- Duplicate array elements don't matter either:
294SELECT '[1, 2, 3]'::jsonb @&gt; '[1, 2, 2]'::jsonb;
295
296-- The object with a single pair on the right side is contained
297-- within the object on the left side:
298SELECT '{"product": "PostgreSQL", "version": 9.4, "jsonb": true}'::jsonb @&gt; '{"version": 9.4}'::jsonb;
299
300-- The array on the right side is <emphasis>not</emphasis> considered contained within the
301-- array on the left, even though a similar array is nested within it:
302SELECT '[1, 2, [1, 3]]'::jsonb @&gt; '[1, 3]'::jsonb;  -- yields false
303
304-- But with a layer of nesting, it is contained:
305SELECT '[1, 2, [1, 3]]'::jsonb @&gt; '[[1, 3]]'::jsonb;
306
307-- Similarly, containment is not reported here:
308SELECT '{"foo": {"bar": "baz"}}'::jsonb @&gt; '{"bar": "baz"}'::jsonb;  -- yields false
309
310-- A top-level key and an empty object is contained:
311SELECT '{"foo": {"bar": "baz"}}'::jsonb @&gt; '{"foo": {}}'::jsonb;
312</programlisting>
313
314  <para>
315   The general principle is that the contained object must match the
316   containing object as to structure and data contents, possibly after
317   discarding some non-matching array elements or object key/value pairs
318   from the containing object.
319   But remember that the order of array elements is not significant when
320   doing a containment match, and duplicate array elements are effectively
321   considered only once.
322  </para>
323
324  <para>
325   As a special exception to the general principle that the structures
326   must match, an array may contain a primitive value:
327  </para>
328<programlisting>
329-- This array contains the primitive string value:
330SELECT '["foo", "bar"]'::jsonb @&gt; '"bar"'::jsonb;
331
332-- This exception is not reciprocal -- non-containment is reported here:
333SELECT '"bar"'::jsonb @&gt; '["bar"]'::jsonb;  -- yields false
334</programlisting>
335
336  <para>
337    <type>jsonb</type> also has an <firstterm>existence</firstterm> operator, which is
338    a variation on the theme of containment: it tests whether a string
339    (given as a <type>text</type> value) appears as an object key or array
340    element at the top level of the <type>jsonb</type> value.
341    These examples return true except as noted:
342  </para>
343<programlisting>
344-- String exists as array element:
345SELECT '["foo", "bar", "baz"]'::jsonb ? 'bar';
346
347-- String exists as object key:
348SELECT '{"foo": "bar"}'::jsonb ? 'foo';
349
350-- Object values are not considered:
351SELECT '{"foo": "bar"}'::jsonb ? 'bar';  -- yields false
352
353-- As with containment, existence must match at the top level:
354SELECT '{"foo": {"bar": "baz"}}'::jsonb ? 'bar'; -- yields false
355
356-- A string is considered to exist if it matches a primitive JSON string:
357SELECT '"foo"'::jsonb ? 'foo';
358</programlisting>
359
360  <para>
361    JSON objects are better suited than arrays for testing containment or
362    existence when there are many keys or elements involved, because
363    unlike arrays they are internally optimized for searching, and do not
364    need to be searched linearly.
365  </para>
366
367  <tip>
368   <para>
369    Because JSON containment is nested, an appropriate query can skip
370    explicit selection of sub-objects.  As an example, suppose that we have
371    a <structfield>doc</structfield> column containing objects at the top level, with
372    most objects containing <literal>tags</literal> fields that contain arrays of
373    sub-objects.  This query finds entries in which sub-objects containing
374    both <literal>"term":"paris"</literal> and <literal>"term":"food"</literal> appear,
375    while ignoring any such keys outside the <literal>tags</literal> array:
376<programlisting>
377SELECT doc-&gt;'site_name' FROM websites
378  WHERE doc @&gt; '{"tags":[{"term":"paris"}, {"term":"food"}]}';
379</programlisting>
380    One could accomplish the same thing with, say,
381<programlisting>
382SELECT doc-&gt;'site_name' FROM websites
383  WHERE doc-&gt;'tags' @&gt; '[{"term":"paris"}, {"term":"food"}]';
384</programlisting>
385    but that approach is less flexible, and often less efficient as well.
386   </para>
387
388   <para>
389    On the other hand, the JSON existence operator is not nested: it will
390    only look for the specified key or array element at top level of the
391    JSON value.
392   </para>
393  </tip>
394
395  <para>
396    The various containment and existence operators, along with all other
397    JSON operators and functions are documented
398    in <xref linkend="functions-json"/>.
399  </para>
400 </sect2>
401
402 <sect2 id="json-indexing">
403  <title><type>jsonb</type> Indexing</title>
404  <indexterm>
405    <primary>jsonb</primary>
406    <secondary>indexes on</secondary>
407  </indexterm>
408
409  <para>
410    GIN indexes can be used to efficiently search for
411    keys or key/value pairs occurring within a large number of
412    <type>jsonb</type> documents (datums).
413    Two GIN <quote>operator classes</quote> are provided, offering different
414    performance and flexibility trade-offs.
415  </para>
416  <para>
417    The default GIN operator class for <type>jsonb</type> supports queries with
418    top-level key-exists operators <literal>?</literal>, <literal>?&amp;</literal>
419    and <literal>?|</literal> operators and path/value-exists operator
420    <literal>@&gt;</literal>.
421    (For details of the semantics that these operators
422    implement, see <xref linkend="functions-jsonb-op-table"/>.)
423    An example of creating an index with this operator class is:
424<programlisting>
425CREATE INDEX idxgin ON api USING GIN (jdoc);
426</programlisting>
427    The non-default GIN operator class <literal>jsonb_path_ops</literal>
428    supports indexing the <literal>@&gt;</literal> operator only.
429    An example of creating an index with this operator class is:
430<programlisting>
431CREATE INDEX idxginp ON api USING GIN (jdoc jsonb_path_ops);
432</programlisting>
433  </para>
434
435  <para>
436    Consider the example of a table that stores JSON documents
437    retrieved from a third-party web service, with a documented schema
438    definition.  A typical document is:
439<programlisting>
440{
441    "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
442    "name": "Angela Barton",
443    "is_active": true,
444    "company": "Magnafone",
445    "address": "178 Howard Place, Gulf, Washington, 702",
446    "registered": "2009-11-07T08:53:22 +08:00",
447    "latitude": 19.793713,
448    "longitude": 86.513373,
449    "tags": [
450        "enim",
451        "aliquip",
452        "qui"
453    ]
454}
455</programlisting>
456    We store these documents in a table named <structname>api</structname>,
457    in a <type>jsonb</type> column named <structfield>jdoc</structfield>.
458    If a GIN index is created on this column,
459    queries like the following can make use of the index:
460<programlisting>
461-- Find documents in which the key "company" has value "Magnafone"
462SELECT jdoc-&gt;'guid', jdoc-&gt;'name' FROM api WHERE jdoc @&gt; '{"company": "Magnafone"}';
463</programlisting>
464    However, the index could not be used for queries like the
465    following, because though the operator <literal>?</literal> is indexable,
466    it is not applied directly to the indexed column <structfield>jdoc</structfield>:
467<programlisting>
468-- Find documents in which the key "tags" contains key or array element "qui"
469SELECT jdoc-&gt;'guid', jdoc-&gt;'name' FROM api WHERE jdoc -&gt; 'tags' ? 'qui';
470</programlisting>
471    Still, with appropriate use of expression indexes, the above
472    query can use an index.  If querying for particular items within
473    the <literal>"tags"</literal> key is common, defining an index like this
474    may be worthwhile:
475<programlisting>
476CREATE INDEX idxgintags ON api USING GIN ((jdoc -&gt; 'tags'));
477</programlisting>
478    Now, the <literal>WHERE</literal> clause <literal>jdoc -&gt; 'tags' ? 'qui'</literal>
479    will be recognized as an application of the indexable
480    operator <literal>?</literal> to the indexed
481    expression <literal>jdoc -&gt; 'tags'</literal>.
482    (More information on expression indexes can be found in <xref
483    linkend="indexes-expressional"/>.)
484  </para>
485  <para>
486    Also, GIN index supports <literal>@@</literal> and <literal>@?</literal>
487    operators, which perform <literal>jsonpath</literal> matching.
488<programlisting>
489SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
490</programlisting>
491<programlisting>
492SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @? '$.tags[*] ? (@ == "qui")';
493</programlisting>
494    GIN index extracts statements of following form out of
495    <literal>jsonpath</literal>: <replaceable>accessors_chain</replaceable> = <replaceable>const</replaceable>.
496    Accessors chain may consist of <literal>.key</literal>,
497    <literal>[*]</literal>, and <literal>[<replaceable>index</replaceable>]</literal> accessors.
498    <literal>jsonb_ops</literal> additionally supports <literal>.*</literal>
499    and <literal>.**</literal> accessors.
500  </para>
501  <para>
502    Another approach to querying is to exploit containment, for example:
503<programlisting>
504-- Find documents in which the key "tags" contains array element "qui"
505SELECT jdoc-&gt;'guid', jdoc-&gt;'name' FROM api WHERE jdoc @&gt; '{"tags": ["qui"]}';
506</programlisting>
507    A simple GIN index on the <structfield>jdoc</structfield> column can support this
508    query.  But note that such an index will store copies of every key and
509    value in the <structfield>jdoc</structfield> column, whereas the expression index
510    of the previous example stores only data found under
511    the <literal>tags</literal> key.  While the simple-index approach is far more
512    flexible (since it supports queries about any key), targeted expression
513    indexes are likely to be smaller and faster to search than a simple
514    index.
515  </para>
516
517  <para>
518    Although the <literal>jsonb_path_ops</literal> operator class supports
519    only queries with the <literal>@&gt;</literal>, <literal>@@</literal>
520    and <literal>@?</literal> operators, it has notable
521    performance advantages over the default operator
522    class <literal>jsonb_ops</literal>.  A <literal>jsonb_path_ops</literal>
523    index is usually much smaller than a <literal>jsonb_ops</literal>
524    index over the same data, and the specificity of searches is better,
525    particularly when queries contain keys that appear frequently in the
526    data.  Therefore search operations typically perform better
527    than with the default operator class.
528  </para>
529
530  <para>
531    The technical difference between a <literal>jsonb_ops</literal>
532    and a <literal>jsonb_path_ops</literal> GIN index is that the former
533    creates independent index items for each key and value in the data,
534    while the latter creates index items only for each value in the
535    data.
536    <footnote>
537     <para>
538      For this purpose, the term <quote>value</quote> includes array elements,
539      though JSON terminology sometimes considers array elements distinct
540      from values within objects.
541     </para>
542    </footnote>
543    Basically, each <literal>jsonb_path_ops</literal> index item is
544    a hash of the value and the key(s) leading to it; for example to index
545    <literal>{"foo": {"bar": "baz"}}</literal>, a single index item would
546    be created incorporating all three of <literal>foo</literal>, <literal>bar</literal>,
547    and <literal>baz</literal> into the hash value.  Thus a containment query
548    looking for this structure would result in an extremely specific index
549    search; but there is no way at all to find out whether <literal>foo</literal>
550    appears as a key.  On the other hand, a <literal>jsonb_ops</literal>
551    index would create three index items representing <literal>foo</literal>,
552    <literal>bar</literal>, and <literal>baz</literal> separately; then to do the
553    containment query, it would look for rows containing all three of
554    these items.  While GIN indexes can perform such an AND search fairly
555    efficiently, it will still be less specific and slower than the
556    equivalent <literal>jsonb_path_ops</literal> search, especially if
557    there are a very large number of rows containing any single one of the
558    three index items.
559  </para>
560
561  <para>
562    A disadvantage of the <literal>jsonb_path_ops</literal> approach is
563    that it produces no index entries for JSON structures not containing
564    any values, such as <literal>{"a": {}}</literal>.  If a search for
565    documents containing such a structure is requested, it will require a
566    full-index scan, which is quite slow.  <literal>jsonb_path_ops</literal> is
567    therefore ill-suited for applications that often perform such searches.
568  </para>
569
570  <para>
571    <type>jsonb</type> also supports <literal>btree</literal> and <literal>hash</literal>
572    indexes.  These are usually useful only if it's important to check
573    equality of complete JSON documents.
574    The <literal>btree</literal> ordering for <type>jsonb</type> datums is seldom
575    of great interest, but for completeness it is:
576<synopsis>
577<replaceable>Object</replaceable> > <replaceable>Array</replaceable> > <replaceable>Boolean</replaceable> > <replaceable>Number</replaceable> > <replaceable>String</replaceable> > <replaceable>Null</replaceable>
578
579<replaceable>Object with n pairs</replaceable> > <replaceable>object with n - 1 pairs</replaceable>
580
581<replaceable>Array with n elements</replaceable> > <replaceable>array with n - 1 elements</replaceable>
582</synopsis>
583      Objects with equal numbers of pairs are compared in the order:
584<synopsis>
585<replaceable>key-1</replaceable>, <replaceable>value-1</replaceable>, <replaceable>key-2</replaceable> ...
586</synopsis>
587      Note that object keys are compared in their storage order;
588      in particular, since shorter keys are stored before longer keys, this
589      can lead to results that might be unintuitive, such as:
590<programlisting>
591{ "aa": 1, "c": 1} > {"b": 1, "d": 1}
592</programlisting>
593      Similarly, arrays with equal numbers of elements are compared in the
594      order:
595<synopsis>
596<replaceable>element-1</replaceable>, <replaceable>element-2</replaceable> ...
597</synopsis>
598      Primitive JSON values are compared using the same
599      comparison rules as for the underlying
600      <productname>PostgreSQL</productname> data type.  Strings are
601      compared using the default database collation.
602  </para>
603 </sect2>
604
605 <sect2>
606  <title>Transforms</title>
607
608  <para>
609   Additional extensions are available that implement transforms for the
610   <type>jsonb</type> type for different procedural languages.
611  </para>
612
613  <para>
614   The extensions for PL/Perl are called <literal>jsonb_plperl</literal> and
615   <literal>jsonb_plperlu</literal>.  If you use them, <type>jsonb</type>
616   values are mapped to Perl arrays, hashes, and scalars, as appropriate.
617  </para>
618
619  <para>
620   The extensions for PL/Python are called <literal>jsonb_plpythonu</literal>,
621   <literal>jsonb_plpython2u</literal>, and
622   <literal>jsonb_plpython3u</literal> (see <xref
623   linkend="plpython-python23"/> for the PL/Python naming convention).  If you
624   use them, <type>jsonb</type> values are mapped to Python dictionaries,
625   lists, and scalars, as appropriate.
626  </para>
627
628  <para>
629   Of these extensions, <literal>jsonb_plperl</literal> is
630   considered <quote>trusted</quote>, that is, it can be installed by
631   non-superusers who have <literal>CREATE</literal> privilege on the
632   current database.  The rest require superuser privilege to install.
633  </para>
634 </sect2>
635
636 <sect2 id="datatype-jsonpath">
637  <title>jsonpath Type</title>
638
639  <indexterm zone="datatype-jsonpath">
640   <primary>jsonpath</primary>
641  </indexterm>
642
643  <para>
644   The <type>jsonpath</type> type implements support for the SQL/JSON path language
645   in <productname>PostgreSQL</productname> to efficiently query JSON data.
646   It provides a binary representation of the parsed SQL/JSON path
647   expression that specifies the items to be retrieved by the path
648   engine from the JSON data for further processing with the
649   SQL/JSON query functions.
650  </para>
651
652  <para>
653   The semantics of SQL/JSON path predicates and operators generally follow SQL.
654   At the same time, to provide a natural way of working with JSON data,
655   SQL/JSON path syntax uses some JavaScript conventions:
656  </para>
657
658  <itemizedlist>
659   <listitem>
660    <para>
661     Dot (<literal>.</literal>) is used for member access.
662    </para>
663   </listitem>
664   <listitem>
665    <para>
666     Square brackets (<literal>[]</literal>) are used for array access.
667    </para>
668   </listitem>
669   <listitem>
670    <para>
671     SQL/JSON arrays are 0-relative, unlike regular SQL arrays that start from 1.
672    </para>
673   </listitem>
674  </itemizedlist>
675
676  <para>
677   An SQL/JSON path expression is typically written in an SQL query as an
678   SQL character string literal, so it must be enclosed in single quotes,
679   and any single quotes desired within the value must be doubled
680   (see <xref linkend="sql-syntax-strings"/>).
681   Some forms of path expressions require string literals within them.
682   These embedded string literals follow JavaScript/ECMAScript conventions:
683   they must be surrounded by double quotes, and backslash escapes may be
684   used within them to represent otherwise-hard-to-type characters.
685   In particular, the way to write a double quote within an embedded string
686   literal is <literal>\"</literal>, and to write a backslash itself, you
687   must write <literal>\\</literal>.  Other special backslash sequences
688   include those recognized in JSON strings:
689   <literal>\b</literal>,
690   <literal>\f</literal>,
691   <literal>\n</literal>,
692   <literal>\r</literal>,
693   <literal>\t</literal>,
694   <literal>\v</literal>
695   for various ASCII control characters, and
696   <literal>\u<replaceable>NNNN</replaceable></literal> for a Unicode
697   character identified by its 4-hex-digit code point.  The backslash
698   syntax also includes two cases not allowed by JSON:
699   <literal>\x<replaceable>NN</replaceable></literal> for a character code
700   written with only two hex digits, and
701   <literal>\u{<replaceable>N...</replaceable>}</literal> for a character
702   code written with 1 to 6 hex digits.
703  </para>
704
705  <para>
706   A path expression consists of a sequence of path elements,
707   which can be any of the following:
708   <itemizedlist>
709    <listitem>
710     <para>
711      Path literals of JSON primitive types:
712      Unicode text, numeric, true, false, or null.
713     </para>
714    </listitem>
715    <listitem>
716     <para>
717      Path variables listed in <xref linkend="type-jsonpath-variables"/>.
718     </para>
719    </listitem>
720    <listitem>
721     <para>
722      Accessor operators listed in <xref linkend="type-jsonpath-accessors"/>.
723     </para>
724    </listitem>
725    <listitem>
726     <para>
727      <type>jsonpath</type> operators and methods listed
728      in <xref linkend="functions-sqljson-path-operators"/>.
729     </para>
730    </listitem>
731    <listitem>
732     <para>
733      Parentheses, which can be used to provide filter expressions
734      or define the order of path evaluation.
735     </para>
736    </listitem>
737   </itemizedlist>
738  </para>
739
740  <para>
741   For details on using <type>jsonpath</type> expressions with SQL/JSON
742   query functions, see <xref linkend="functions-sqljson-path"/>.
743  </para>
744
745  <table id="type-jsonpath-variables">
746   <title><type>jsonpath</type> Variables</title>
747   <tgroup cols="2">
748    <colspec colname="col1" colwidth="1*"/>
749    <colspec colname="col2" colwidth="2*"/>
750    <thead>
751     <row>
752      <entry>Variable</entry>
753      <entry>Description</entry>
754     </row>
755    </thead>
756    <tbody>
757     <row>
758      <entry><literal>$</literal></entry>
759      <entry>A variable representing the JSON value being queried
760      (the <firstterm>context item</firstterm>).
761      </entry>
762     </row>
763     <row>
764      <entry><literal>$varname</literal></entry>
765      <entry>
766        A named variable. Its value can be set by the parameter
767        <parameter>vars</parameter> of several JSON processing functions;
768        see <xref linkend="functions-json-processing-table"/> for details.
769        <!-- TODO: describe PASSING clause once implemented !-->
770      </entry>
771     </row>
772     <row>
773      <entry><literal>@</literal></entry>
774      <entry>A variable representing the result of path evaluation
775      in filter expressions.
776      </entry>
777     </row>
778    </tbody>
779   </tgroup>
780  </table>
781
782  <table id="type-jsonpath-accessors">
783   <title><type>jsonpath</type> Accessors</title>
784   <tgroup cols="2">
785    <colspec colname="col1" colwidth="1*"/>
786    <colspec colname="col2" colwidth="2*"/>
787    <thead>
788     <row>
789      <entry>Accessor Operator</entry>
790      <entry>Description</entry>
791     </row>
792    </thead>
793    <tbody>
794     <row>
795      <entry>
796       <para>
797        <literal>.<replaceable>key</replaceable></literal>
798       </para>
799       <para>
800        <literal>."$<replaceable>varname</replaceable>"</literal>
801       </para>
802      </entry>
803      <entry>
804       <para>
805        Member accessor that returns an object member with
806        the specified key. If the key name matches some named variable
807        starting with <literal>$</literal> or does not meet the
808        JavaScript rules for an identifier, it must be enclosed in
809        double quotes to make it a string literal.
810       </para>
811      </entry>
812     </row>
813     <row>
814      <entry>
815       <para>
816        <literal>.*</literal>
817       </para>
818      </entry>
819      <entry>
820       <para>
821        Wildcard member accessor that returns the values of all
822        members located at the top level of the current object.
823       </para>
824      </entry>
825     </row>
826     <row>
827      <entry>
828       <para>
829        <literal>.**</literal>
830       </para>
831      </entry>
832      <entry>
833       <para>
834        Recursive wildcard member accessor that processes all levels
835        of the JSON hierarchy of the current object and returns all
836        the member values, regardless of their nesting level. This
837        is a <productname>PostgreSQL</productname> extension of
838        the SQL/JSON standard.
839       </para>
840      </entry>
841     </row>
842     <row>
843      <entry>
844       <para>
845        <literal>.**{<replaceable>level</replaceable>}</literal>
846       </para>
847       <para>
848        <literal>.**{<replaceable>start_level</replaceable> to
849        <replaceable>end_level</replaceable>}</literal>
850       </para>
851      </entry>
852      <entry>
853       <para>
854        Like <literal>.**</literal>, but selects only the specified
855        levels of the JSON hierarchy. Nesting levels are specified as integers.
856        Level zero corresponds to the current object. To access the lowest
857        nesting level, you can use the <literal>last</literal> keyword.
858        This is a <productname>PostgreSQL</productname> extension of
859        the SQL/JSON standard.
860       </para>
861      </entry>
862     </row>
863     <row>
864      <entry>
865       <para>
866        <literal>[<replaceable>subscript</replaceable>, ...]</literal>
867       </para>
868      </entry>
869      <entry>
870       <para>
871        Array element accessor.
872        <literal><replaceable>subscript</replaceable></literal> can be
873        given in two forms: <literal><replaceable>index</replaceable></literal>
874        or <literal><replaceable>start_index</replaceable> to <replaceable>end_index</replaceable></literal>.
875        The first form returns a single array element by its index. The second
876        form returns an array slice by the range of indexes, including the
877        elements that correspond to the provided
878        <replaceable>start_index</replaceable> and <replaceable>end_index</replaceable>.
879       </para>
880       <para>
881        The specified <replaceable>index</replaceable> can be an integer, as
882        well as an expression returning a single numeric value, which is
883        automatically cast to integer. Index zero corresponds to the first
884        array element. You can also use the <literal>last</literal> keyword
885        to denote the last array element, which is useful for handling arrays
886        of unknown length.
887       </para>
888      </entry>
889     </row>
890     <row>
891      <entry>
892       <para>
893        <literal>[*]</literal>
894       </para>
895      </entry>
896      <entry>
897       <para>
898        Wildcard array element accessor that returns all array elements.
899       </para>
900      </entry>
901     </row>
902    </tbody>
903   </tgroup>
904  </table>
905
906 </sect2>
907</sect1>
908