1---
2title: "AST query language"
3layout: default
4canonical: "/puppetdb/latest/api/query/v4/ast.html"
5---
6
7# AST query language
8
9[root]: ./overview.markdown
10[catalogs]: ./catalogs.markdown
11[contact]: ../../../pdb_support_guide.markdown#contact-us
12[edges]: ./edges.markdown
13[environments]: ./environments.markdown
14[events]: ./events.markdown
15[facts]: ./facts.markdown
16[fact-contents]: ./fact-contents.markdown
17[fact-paths]: ./fact-paths.markdown
18[inventory]: ./inventory.markdown
19[nodes]: ./nodes.markdown
20[pg-regex]: https://www.postgresql.org/docs/11/functions-matching.html#FUNCTIONS-POSIX-REGEXP
21[producers]: ./producers.markdown
22[query]: query.markdown
23[reports]: ./reports.markdown
24[resources]: ./resources.markdown
25[entities]: ./entities.markdown
26[pql]: ./pql.markdown
27[urlencode]: http://en.wikipedia.org/wiki/Percent-encoding
28[to-char]: http://www.postgresql.org/docs/9.6/static/functions-formatting.html
29
30## Summary
31
32The AST (abstract syntax tree) query language for PuppetDB is a language that presents
33itself as a raw AST format. It can be used to provide complex querying via REST on each of
34PuppetDB's query [endpoints][entities].
35
36This document outlines the operator syntax for this query language.
37
38An easier to use alternative to this query language is the [Puppet query language][pql], which is
39largely based on the AST query language.
40
41## Query strings
42
43An AST query string passed to the `query` URL parameter of a REST endpoint must be a [URL-encoded][urlencode]
44JSON array, which may contain scalar data types (usually strings) and additional arrays, that describes a
45complex _comparison operation_ in _prefix notation_ with an **operator** first and its **arguments** following.
46
47That is, before being URL-encoded, all AST query strings follow this form:
48
49    [ "<OPERATOR>", "<ARGUMENT>", (..."<ARGUMENT>"...) ]
50
51Different operators may take different numbers (and types) of arguments.
52
53## Binary operators
54
55Each of these operators accepts two arguments: a **field** and a
56**value.** These operators are **non-transitive,** which means that their syntax must always be:
57
58    ["<OPERATOR>", "<FIELD>", "<VALUE>"]
59
60The available fields for each endpoint are listed in that endpoint's documentation.
61
62### `=` (equality)
63
64**Works with:** strings, numbers, timestamps, Booleans, arrays, multi, path.
65
66**Matches if:** the field's actual value is exactly the same as the provided value.
67
68* Most fields are strings.
69* Some fields are Booleans.
70* Arrays match if any **one** of their elements matches.
71* Path matches are a special kind of array, and must be exactly matched with this operator.
72
73### `>` (greater than)
74
75**Works with:** numbers, timestamps, multi.
76
77**Matches if:** the field is greater than the provided value.
78
79### `<` (less than)
80
81**Works with:** numbers, timestamps, multi.
82
83**Matches if:** the field is less than the provided value.
84
85### `>=` (greater than or equal to)
86
87**Works with:** numbers, timestamps, multi.
88
89**Matches if:** the field is greater than or equal to the provided value.
90
91### `<=` (less than or equal to)
92
93**Works with:** numbers, timestamps, multi.
94
95**Matches if:** the field is less than or equal to the provided value.
96
97### `~` (regexp match)
98
99**Works with:** strings, multi.
100
101**Matches if:** the field's actual value matches the provided regular expression. The provided value must be a regular expression represented as a JSON string:
102
103* The regexp **must not** be surrounded by the slash characters (`/rexegp/`) that delimit regexps in many languages.
104* Every backslash character **must** be escaped with an additional backslash. Thus, a sequence like `\d` would be represented as `\\d`, and a literal backslash (represented in a regexp as a double-backslash `\\`) would be represented as a quadruple-backslash (`\\\\`).
105
106The following example would match if the `certname` field's actual value resembled something like `www03.example.com`:
107
108    ["~", "certname", "www\\d+\\.example\\.com"]
109
110> **Note:** Regular expression matching is performed by the database
111> backend, so the available [regexp features](#pg-regex) are
112> determined by PostgreSQL. For best results, use the simplest and
113> most common features that can accomplish your task.
114
115### `~>` (regexp array match)
116
117**Works with:** paths.
118
119**Matches if:** the array matches using the regular expressions provided within in each element. Array indexes are coerced to strings.
120
121The following example would match any network interface names starting with "eth":
122
123    ["~>", "path", ["networking", "eth.*", "macaddress"]]
124
125If you want to match any index for an array path element, you can use regular expressions, as the element acts like a string:
126
127    ["~>", "path", [<array_fact>, ".*"]]
128
129### `null?` (is null)
130
131**Works with:** fields that may be null.
132
133**Matches if:** the field's value is null (when second argument is `true`) or the field is **not** null, or has a real value (when second argument is `false`).
134
135The following example would return events that do not have an associated line number:
136
137    ["null?", "line", true]
138
139Similarly, the below query would return events that do have a specified line number:
140
141    ["null?", "line", false]
142
143## Boolean operators
144
145Every argument of these operators should be a **complete query string** in its own right. These operators are **transitive:** the order of their arguments does not matter.
146
147### `and`
148
149**Matches if:** **all** of its arguments would match. Accepts any number of query strings as its arguments.
150
151### `or`
152
153**Matches if:** **at least one** of its arguments would match. Accepts any number of query strings as its arguments.
154
155### `not`
156
157**Matches if:** its argument **would not** match. Accepts a **single** query string as its argument.
158
159## Projection operators
160
161### `extract`
162
163To reduce the keypairs returned for each result in the response, you can use **extract**:
164
165    ["extract", ["hash", "certname", "transaction_uuid"],
166      ["=", "certname", "foo.com"]]
167
168When only extracting a single column, the `[]` are optional:
169
170    ["extract", "transaction_uuid",
171      ["=", "certname", "foo.com"]]
172
173When applying an aggregate function over a `group_by` clause, an extract
174statement takes the form:
175
176    ["extract", [["function", "count"], "status"],
177      ["=", "certname", "foo.com"],
178      ["group_by", "status"]]
179
180Extract can also be used with a standalone function application:
181
182    ["extract", [["function", "count"]], ["~", "certname", ".\*.com"]]
183
184or
185
186    ["extract", [["function", "count"]]]
187
188#### Extracting a subtree
189
190The JSON fields that support dot notation for hash descendance also support
191dot notation for extracting a subtree. See the Dot notation section below
192for more information.
193
194    ["extract", ["facts.os.family"]]
195
196### `function`
197
198The **function** operator is used to call a function on the result of a
199subquery. Supported functions are described below.
200
201#### `avg`, `sum`, `min`, `max`
202These functions operate on any numeric column and they take the column
203name as an argument, as in the examples above.
204
205#### `count`
206The `count` function can be used with or without a column. When no column is
207supplied, it will return the number of results in the associated subquery.
208Using the function with a column will return the number of results where the
209specified column is not null.
210
211#### `to_string`
212The `to_string` function operates on timestamps and integers, allowing them to
213be formatted in a user-defined manner before being returned from puppetdb.
214Available  formats are the same as those documented for [PostgreSQL's `to_char`
215function][to-char]. For instance, to get the full lower case month name of the
216`producer_timestamp`,  you can query the reports endpoint with:
217
218```
219["extract", [["function", "to_string", "producer_timestamp", "month"]]]
220```
221
222To get the last 2 digits of the year a report was submitted  from the Puppet Server:
223
224```
225["extract", [["function", "to_string", "producer_timestamp", "YY"]]]]
226```
227
228To get the uptime_seconds fact's value as a string, the following query can be used on
229facts or fact-contents endpoint:
230
231```
232["extract", [["function", "to_string", "value", "999999999"]], ["=","name", "uptime_seconds"]]
233```
234
235Please note that in order for `to_string` function to work with integer values, a mask
236must be provided. For more information about masks and how to provide them, please read
237the documentation for [PostgreSQL's `to_char`function][to-char].
238
239### `group_by`
240
241The **group_by** operator must be applied as the last argument of an extract,
242and takes one or more column names as arguments. For instance, to get event
243status counts for active certname by status, you can query the events endpoint
244with:
245
246    ["extract", [["function", "count"], "status", "certname"],
247      ["group_by", "status", "certname"]]
248
249To get the average uptime for your nodes:
250
251    ["extract", [["function", "avg", "value"]], ["=", "name", "uptime_seconds"]]
252
253## Dot notation
254
255*Note*: Dot notation for hash descendence is under development. Currently it has
256full support on the `facts` and `trusted` response keys of the `inventory`
257endpoint, and partial support on the `parameters` column of the resources
258endpoint. It may be expanded to other endpoints in the future based on demand.
259
260Certain types of JSON data returned by PuppetDB can be queried in a structured
261way using `dot notation`. The rules for dot notation are:
262* Hash descendence is represented by a period-separated sequence of key names
263* Array indexing (`inventory` only) is represented with brackets (`[]`) on the
264end of a key.
265* Regular expression matching ([`inventory`](#inventory) only) is
266  represented with the `match` operator, but note that [`match` in its
267  current form has been deprecated](#dotted-field-syntax), and is
268  likely to be removed or altered in a backward-incompatible way in a
269  future release.
270
271For example, given the inventory response
272
273
274    {
275        "certname" : "mbp.local",
276        "timestamp" : "2016-07-11T20:02:33.190Z",
277        "environment" : "production",
278        "facts" : {
279            "kernel" : "Darwin",
280            "operatingsystem" : "Darwin",
281            "macaddress_p2p0" : "0e:15:c2:d6:f8:4e",
282            "system_uptime" : {
283                "days" : 0,
284                "hours" : 1,
285                "uptime" : "1:52 hours",
286                "seconds" : 6733
287            },
288            "macaddress_awdl0" : "6e:31:ef:e6:36:54",
289            "processors": {
290                "models": [
291                    "Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz",
292                    "Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz",
293                    "Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz",
294                    "Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz"],
295                "count": 4,
296                "physicalcount": 1
297            },
298            ...
299        },
300        "trusted" : {
301            "domain" : "local",
302            "certname" : "mbp.local",
303            "hostname" : "mbp",
304            "extensions" : { },
305            "authenticated" : "remote"
306        }
307    }
308
309valid queries would include
310
311* `["=", "facts.kernel", "Darwin"]`
312
313* `["=", "facts.system_uptime.days", 0]`
314
315* `[">", "facts.system_uptime.hours", 0]`
316
317* `["~", "facts.processors.models[0]", "Intel.*"]`
318
319### Dotted Projections
320
321Dot notation is also supported for extracting a subtree of JSON fields.
322For example you can query the inventory endpoint with
323
324    ["extract", ["trusted.certname", "facts.system_uptime"]]
325
326To get a response with only the elements you've asked for
327
328    {
329        "trusted.certname": "mbp.local",
330        "facts.system_uptime.uptime": {
331            "days" : 0,
332            "hours" : 1,
333            "uptime" : "1:52 hours",
334            "seconds" : 6733
335        }
336    }
337
338### Dotted field syntax
339
340A dotted field, which repseents a path into a JSON tree is made up of
341components separated by dots (`.`), for example `facts.kernel`. Any
342path component can be double-quoted, for example `facts."x.y".z`, in
343which case the name will include all of the characters after the first
344double-quote, and before the next double-quote that is itself not
345preceded by a backslash and is followed by either a dot, or the end of
346the field. So the previous example `facts."x.y".z` represents the
347three components, `facts`, `x.y`, and `z`. In AST queries, any
348double-quotes will have to be properly JSON escaped. So in an
349`extract` the path `x."y.z"` becomes `[extract "x.\"y.z\"", ...]`.
350
351There is currently no way to represent a field component that contains
352a dot and ends in a backslash. For example, a fact named `x.y\` must be
353quoted, given the dot, but as just mentioned, quoted fields cannot end
354in a backslash.
355
356> **Note:** the `match()` operator described here is deprecated and is
357> likely to be retired or altered in a backward-incompatible way in a
358> future release.
359
360In some cases (e.g. [inventory endpoint](#inventory)) dotted fields
361can also contain a `match()` component, for example
362`facts.partitions.match("sd.*")` The match pattern must be a
363[PostgreSQL regular expression](#pg-regex), and must begin with
364`match`, open paren, double quote, and it will end at the next double
365quote, close paren that is not preceded by a backslash and is followed
366by either a dot, or the end of the field.  The regex then, has
367essentially the same syntax as a double quoted field. And similarly,
368there is currently no way to specify a match regular expression that
369ends in a backslash.
370
371With the current implementation, the `match()` component's behavior is
372not well defined, likley to be surprising, and likely to change in the
373future, so we recommend avoiding it for now, but please do
374[contact us](#contact-us) if you are currently using it, or would like
375to use an operator with better semantics, so we can incorporate that
376information into future plans.
377
378As an example of the potentially surprising behavior, the appearance
379of any `match()` operator in a dotted field can cause the entire
380field, not just the `match()` segment, to be handled as a regular
381expression in an awkward manner.
382
383## Context operators
384
385*Note:* Setting the context at the top of the query is only supported on the
386[root][root] endpoint.
387
388Setting context in a query allows you to choose the entity you are querying
389on. This augments the endpoint support we have today, whereby the endpoint
390decides the context. For example, `/pdb/query/v4/nodes` sets the context of the query
391to `nodes`.
392
393### `from`
394
395The `from` operator allows you to choose the [entity][entities] that you want to query and
396provide optional query and paging clauses to filter those results. This operator can
397be used at the top-level context of a query:
398
399    ["from", "nodes", ["=", "certname", "myserver"]]
400
401The `from` operator can also be used in a subquery for setting the context when
402using the [`in` operator](#subquery-operators).
403
404When querying a particular endpoint, such as `/pdb/query/v4/nodes`, the endpoint provides
405the context for the query. Querying the [root] endpoint requires specifying a
406context explicitly.
407
408## Paging operators (`limit`, `offset`, `order_by`)
409
410PuppetDB allows specification of paging clauses within a "from" clause in a
411query or subquery. The `limit` and `offset` operators both accept an
412integer-valued argument, and `order_by` accepts a vector of either column names
413or vector pairs containing a column name and an ordering of "asc" or "desc".
414For example,
415
416    ["limit", 1]
417
418    ["offset", 1]
419
420    ["order_by", ["certname"]]
421
422    ["order_by", ["certname", ["producer_timestamp", "desc"]]]
423
424When no ordering is explicitly specified, as in the case of "certname" in the
425example above, ascending order is assumed. Here are a few examples of queries
426using paging operators:
427
428Return the most recent ten reports for a certname:
429
430    ["from", "reports",
431      ["=", "certname", "myserver"],
432      ["order_by", [["producer_timestamp", "desc"]]],
433      ["limit", 10]]
434
435Return the next page of ten reports:
436
437    ["from", "reports",
438      ["=", "certname", "myserver"],
439      ["order_by", [["receive_time", "desc"]]],
440      ["limit", 10],
441      ["offset", 10]]
442
443Return the most recent ten reports for any certname:
444
445    ["from", "reports",
446      ["order_by", [["producer_timestamp", "desc"]]],
447      ["limit", 10]]
448
449Return the nodes represented in the ten most recent reports:
450
451    ["from", "nodes",
452      ["in", "certname",
453        ["from", "reports",
454          ["extract", "certname"],
455          ["limit", 10],
456          ["order_by", [["certname", "desc"]]]]]]
457
458The order in which paging operators are supplied does not matter.
459
460## Subquery operators
461
462Subqueries allow you to correlate data from multiple sources or multiple
463rows. For instance, a query such as "fetch the IP addresses of all nodes with
464`Class[Apache]`" would have to use both facts and resources to return a list of facts.
465
466There are two forms of subqueries, implicit and explicit, and both forms work the
467same under the hood. Note, however, that the implicit form only requires you to specify the related entity, while the explicit form requires you to be specify exactly how
468data should be joined during the subquery.
469
470### `subquery` (implicit subqueries)
471
472Implicit queries work like most operators, and simply require you to specify the
473related entity and the query to use:
474
475    ["subquery", "<ENTITY>", <SUBQUERY STATEMENT>]
476
477The [`<ENTITY>`][entities] is the particular entity you are subquerying on, however not
478all entities are implicitly relatable to all other entities, as not every relationship makes sense.
479Consult the documentation for the chosen [`<ENTITY>`][entities] for details on what
480implicit relationships are supported.
481
482In PuppetDB, we keep a map of how different entities relate to each
483other, and therefore no data beyond the entity is needed in this case. This is
484different from explicit subqueries, where you must specify how
485two entities are related. Implicit subqueries can be used to join any two
486entities that have a `certname` field. Additional relationships are described
487in the endpoint-specific documentation as applicable.
488
489#### Implicit subquery examples
490
491A query string like the following on the [`nodes`][nodes] endpoint will return the list
492of all nodes with the `Package[Tomcat]` resource in their catalog, and a certname starting
493with `web1`:
494
495    ["and",
496      ["~", "certname", "^web1"],
497      ["subquery", "resources",
498        ["and",
499          ["=", "type", "Package"],
500          ["=", "title", "Tomcat"]]]]
501
502If you want to display the entire `networking` fact, and the host's interface uses a certain mac address,
503you can do the following on the [`facts`][facts] endpoint:
504
505    ["and",
506      ["=", "name", "networking"],
507      ["subquery", "fact_contents",
508        ["and",
509          ["~>", "path", ["networking", ".*", "macaddress", ".*"]],
510          ["=", "value", "aa:bb:cc:dd:ee:00"]]]]
511
512### Explicit subqueries
513
514While implicit subqueries can make your syntax succinct, not all relationships are
515mapped internally. For these more advanced subqueries, you need to specify exactly the fields that
516a subquery should join on. This is where an explicit subquery can be useful.
517
518Explicit subqueries are unlike the other operators listed above. They always appear
519together in one of the following forms:
520
521    ["in", ["<FIELDS>"], ["extract", ["<FIELDS>"], <SUBQUERY STATEMENT>] ]
522
523The second new methodology uses `from` to set the context, and now looks like this:
524
525    ["in", ["<FIELDS>"], ["from", <ENTITY>, ["extract", ["<FIELDS>"], <SUBQUERY>] ] ]
526
527That is:
528
529* The `in` operator results in a complete query string. The `extract` operator and the subqueries do not.
530* An `in` statement **must** contain one or more fields and an `extract` statement.
531* An `extract` statement **must** contain one or more fields and a subquery statement.
532
533These statements work together as follows (working "outward" and starting with the subquery):
534
535* The subquery collects a group of PuppetDB objects (specifically, a group of [resources][resources], [facts][facts], [fact-contents][fact-contents], or [nodes][nodes]). Each of these objects has many **fields.**
536* The `extract` statement collects the value of one or more **fields** across every object returned by the subquery.
537* The `in` statement **matches** if its field values are present in the list returned by the `extract` statement.
538
539Subquery | Extract | In
540---------|---------|---
541Every resource whose type is "Class" and title is "Apache." (Note that all resource objects have a `certname` field, among other fields.) | Every `certname` field from the results of the subquery. | Match if the `certname` field is present in the list from the `extract` statement.
542
543The complete `in` statement described in the table above would match any object that shares a `certname` with a node that has `Class[Apache]`. This could be combined with a Boolean operator to get a specific fact from every node that matches the `in` statement.
544
545#### `in`
546
547An `in` statement constitutes a full query string, which can be used alone or as an argument for a [Boolean operator](#boolean-operators).
548
549"In" statements are **non-transitive** and take two arguments:
550
551* The first argument **must** consist of one or more **fields** for the endpoint
552  or entity **being queried.**. This is a string or vector of strings.
553* The second argument **must** be either:
554  * an **`extract` statement,** which acts as a list of fields to extract during
555   the subquery for matching against the **fields** in the `in` clause.
556  * a **`from` statement,** which sets the context, and allows for an extract
557   statement to be provided.
558  * an **`array` statement,** which acts as a list of values to match against the
559   **field** in the `in` clause.
560
561**Matches if:** the field values are included in the list of values created by the `extract` or `from` statement.
562
563##### `array`
564
565An `in` statement also accepts an `array` statement as a second argument.
566
567"Array" statements take a single vector argument of values to match the first
568argument of `in` against.
569
570The following query filters for the nodes, `foo.local`, `bar.local`, and
571`baz.local`:
572
573    ["in", "certname",
574     ["array",
575      ["foo.local",
576       "bar.local",
577       "baz.local"]]]
578
579which is equivalent to the following query:
580
581    ["or",
582     ["=","certname","foo.local"],
583     ["=","certname","bar.local"],
584     ["=","certname","baz.local"]]
585
586The `in`-`array` operators support much of the same syntax as the `=` operator.
587For example, the following query on the `/nodes` endpoint is valid:
588
589    ["in", ["fact", "uptime_seconds"],
590     ["array",
591      [20000.0,
592       150.0,
593       30000.0]]]
594
595#### `from`
596
597This statement works like the top-level [`from`](#context-operators) operator,
598and expects an [entity][entities] as the first argument and an optional query in
599the second argument. However, when used within an `in` clause, an `extract`
600statement is expected to choose the fields:
601
602    ["in", "certname",
603     ["from", "facts",
604      ["extract", "certname",
605       [<QUERY>]]]]
606
607#### `extract`
608
609"Extract" statements are **non-transitive** and take two arguments:
610
611* The first argument **must** be a valid set of **fields** for the endpoint
612  **being subqueried** (see second argument). This is a string or vector of
613  strings.
614* The second argument:
615** **must** contain a **subquery statement**
616** or when used with the new `from` operator, **may** contain an optional query.
617
618As the second argument of an `in` statement, an `extract` statement acts as a
619list of possible values. This list is compiled by extracting the value of the
620requested field from every result of the subquery.
621
622#### `select_<ENTITY>` subquery statements
623
624A subquery statement **does not** constitute a full query string. It may only be used as the second argument of an `extract` statement.
625
626Subquery statements are **non-transitive** and take two arguments:
627
628* The first argument **must** be the **name** of one of the available subqueries (listed below).
629* The second argument **must** be a **full query string** that makes sense for the endpoint being subqueried.
630
631As the second argument of an `extract` statement, a subquery statement acts as a collection of PuppetDB objects. Each of the objects returned by the subquery has many fields; the `extract` statement takes the value of one field from each of those objects, and passes that list of values to the `in` statement that contains it.
632
633Each subquery acts as a normal query to one of the PuppetDB endpoints. For info on constructing useful queries, see the docs page for the endpoint matching the subquery:
634
635* [`select_catalogs`][catalogs]
636* [`select_edges`][edges]
637* [`select_environments`][environments]
638* [`select_events`][events]
639* [`select_facts`][facts]
640* [`select_fact_contents`][fact-contents]
641* [`select_fact_paths`][fact-paths]
642* [`select_nodes`][nodes]
643* [`select_producers`][producers]
644* [`select_reports`][reports]
645* [`select_resources`][resources]
646
647#### Explicit subquery examples
648
649This query string queries the `/facts` endpoint for the IP address of
650all nodes with `Class[Apache]`:
651
652    ["and",
653      ["=", "name", "ipaddress"],
654      ["in", "certname",
655        ["extract", "certname",
656          ["select_resources",
657            ["and",
658              ["=", "type", "Class"],
659              ["=", "title", "Apache"]]]]]]
660
661This query string queries the `/nodes` endpoint for all nodes with `Class[Apache]`:
662
663    ["in", "certname",
664      ["extract", "certname",
665        ["select_resources",
666          ["and",
667            ["=", "type", "Class"],
668            ["=", "title", "Apache"]]]]]
669
670This query string queries the `/facts` endpoint for the IP address of
671all Debian nodes.
672
673    ["and",
674      ["=", "name", "ipaddress"],
675      ["in", "certname",
676        ["extract", "certname",
677          ["select_facts",
678            ["and",
679              ["=", "name", "operatingsystem"],
680              ["=", "value", "Debian"]]]]]]
681
682This query string queries the `/facts` endpoint for uptime_hours of all nodes with
683facts_environment `production`:
684
685    ["and",
686      ["=", "name", "uptime_hours"],
687      ["in", "certname",
688        ["extract", "certname",
689          ["select_nodes",
690            ["=", "facts_environment", "production"]]]]]
691
692To find node information for a host that has a macaddress of `aa:bb:cc:dd:ee:00` as
693its first macaddress on the interface `eth0`, you could use this query on '/nodes':
694
695    ["in", "certname",
696      ["extract", "certname",
697        ["select_fact_contents",
698          ["and",
699            ["=", "path", ["networking", "eth0", "macaddress", 0]],
700            ["=", "value", "aa:bb:cc:dd:ee:00"]]]]]
701
702To exhibit a subquery using multiple fields, you could use the following
703on '/facts' to list all top-level facts containing fact contents with paths
704starting with "up" and value less than 100:
705
706    ["in", ["certname", "name"],
707      ["extract", ["certname", "name"],
708        ["select_fact_contents",
709          ["and",
710            ["~>", "path", ["up.*"]],
711            ["<", "value", 100]]]]]
712
713Queries are restricted to active nodes by default; to make this explicit, the
714special "node_state" field may be queried using the values "active", "inactive",
715or "any". For example, to list all catalogs from inactive nodes, use this on the
716/catalogs endpoint:
717
718    ["=", "node_state", "inactive"]
719
720This expands internally into comparisons against each node's deactivation and
721expiration time; a node is consider inactive if either field is set.
722
723#### Explicit subquery examples (with the `from` operator)
724
725Additions to the query language in support of PQL introduced new ways to
726express subqueries using the `from` operator. For example, a query such as this:
727
728    ["and",
729      ["=", "name", "ipaddress"],
730      ["in", "certname",
731        ["extract", "certname",
732          ["select_resources",
733            ["and",
734              ["=", "type", "Class"],
735              ["=", "title", "Apache"]]]]]]
736
737will now look like this:
738
739    ["and",
740      ["=", "name", "ipaddress"],
741      ["in", "certname",
742        ["from", "resources",
743          ["extract", "certname",
744            ["and",
745              ["=", "type", "Class"],
746              ["=", "title", "Apache"]]]]]]
747
748Executing this query on the `/facts` endpoint would filter for `uptime_hours` for all nodes with
749`facts_environment` set to `production`:
750
751    ["and",
752      ["=", "name", "uptime_hours"],
753      ["in", "certname",
754        ["from", "nodes",
755          ["extract", "certname",
756            ["=", "facts_environment", "production"]]]]]
757
758To find node information for a host that has a macaddress of `aa:bb:cc:dd:ee:00` as
759its first macaddress on the interface `eth0`, you could use this query on `/nodes`:
760
761    ["in", "certname",
762      ["from", "fact_contents",
763        ["extract", "certname",
764          ["and",
765            ["=", "path", ["networking", "eth0", "macaddress", 0]],
766            ["=", "value", "aa:bb:cc:dd:ee:00"]]]]]
767