# Search Query Syntax

We support a simple syntax for complex queries with the following rules:

* Multi-word phrases are simply a list of tokens, e.g. `foo bar baz`, and imply intersection (AND) of the terms.
* Exact phrases are wrapped in quotes, e.g. `"hello world"`.
* OR unions (i.e. `word1 OR word2`) are expressed with a pipe (`|`), e.g. `hello|hallo|shalom|hola`.
* NOT negation (i.e. `word1 NOT word2`) of expressions or sub-queries is expressed with a leading `-`, e.g. `hello -world`. As of version 0.19.3, purely negative queries (i.e. `-foo` or `-@title:(foo|bar)`) are supported.
* Prefix matches (all terms starting with a prefix) are expressed with a trailing `*`, e.g. `hel*`. For performance reasons, a minimum prefix length is enforced (2 by default, but configurable).
* A special "wildcard query" that returns all results in the index: `*` (cannot be combined with anything else).
* Selection of specific fields using the syntax `@field:hello world`.
* Numeric range matches on numeric fields with the syntax `@field:[{min} {max}]`.
* Geo radius matches on geo fields with the syntax `@field:[{lon} {lat} {radius} {m|km|mi|ft}]`.
* Tag field filters with the syntax `@field:{tag | tag | ...}`. See the full documentation on [tag fields](/Tags).
* Optional terms or clauses: `foo ~bar` means `bar` is optional, but documents containing `bar` will rank higher.
* Fuzzy matching on terms (as of v1.2.0): `%hello%` matches all terms within a Levenshtein distance of 1 from `hello`.
* An expression in a query can be wrapped in parentheses to disambiguate, e.g. `(hello|hella) (world|werld)`.
* Query attributes can be applied to individual clauses, e.g. `(foo bar) => { $weight: 2.0; $slop: 1; $inorder: false; }`.
* Combinations of the above can be used together, e.g. `hello (world|foo) "bar baz" bbbb`.

## Pure negative queries

As of version 0.19.3 it is possible to have a query consisting of just a negative expression, e.g. `-hello` or `-(@title:foo|bar)`. The results will be all the documents *NOT* containing the query terms.

!!! warning
    Any complex expression can be negated this way. However, use caution: if the negated expression matches few or no documents, the query is equivalent to traversing and ranking all the documents in the index, which can be slow and cause high CPU consumption.

## Field modifiers

As of version 0.12 it is possible to specify field modifiers in the query and not just using the INFIELDS global keyword.

Per query expression or sub-expression, it is possible to specify which fields it matches, by prepending the expression with the `@` symbol, the field name and a `:` (colon) symbol.

If a field modifier precedes multiple words, they are considered to be a phrase with the same modifier.

If a field modifier precedes an expression in parentheses, it applies only to the expression inside the parentheses.

Multiple modifiers can be combined to create complex filtering on several fields. For example, if we have an index of car models, with a vehicle class, country of origin and engine type, we can search for SUVs made in Korea with hybrid or diesel engines with the following query:

```
FT.SEARCH cars "@country:korea @engine:(diesel|hybrid) @class:suv"
```

Multiple field modifiers can be applied to the same term or group of terms. For example:

```
FT.SEARCH idx "@title|body:(hello world) @url|image:mydomain"
```

This will search for documents that have "hello world" either in the body or the title, and the term "mydomain" in their url or image fields.

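As a rough illustration of how these modifiers compose, the Python sketch below assembles the car query from per-field clauses. `field_clause` is a hypothetical helper, not part of RediSearch or any client library; it only builds the query string, and it always wraps the expression in parentheses so that the scope of each modifier is explicit.

```python
def field_clause(fields, expression):
    """Scope `expression` to one or more fields, e.g. @title|body:(hello world)."""
    if isinstance(fields, str):
        fields = [fields]
    # Several field names are joined with "|"; the parentheses make the
    # modifier apply to the whole expression.
    return "@{}:({})".format("|".join(fields), expression)


# Clauses separated by spaces are intersected (AND).
query = " ".join([
    field_clause("country", "korea"),
    field_clause("engine", "diesel|hybrid"),
    field_clause("class", "suv"),
])
print(query)
```

The resulting string can be passed as the query argument of `FT.SEARCH`.
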
## Numeric filters in query

If a field in the schema is defined as NUMERIC, it is possible to filter on it either with the FILTER argument in the Redis request or by specifying filtering rules in the query. The syntax is `@field:[{min} {max}]`, e.g. `@price:[100 200]`.

### A few notes on numeric predicates

1. It is possible to specify a numeric predicate as the entire query, whereas it is impossible to do it with the FILTER argument.

2. It is possible to intersect or union multiple numeric filters in the same query, be it for the same field or different ones.

3. `-inf`, `inf` and `+inf` are acceptable numbers in a range. Thus greater-than 100 is expressed as `[(100 inf]`.

4. Numeric filters are inclusive. Exclusive min or max are expressed with `(` prepended to the number, e.g. `[(100 (200]`.

5. It is possible to negate a numeric filter by prepending a `-` sign to the filter, e.g. returning results where price differs from 100 is expressed as: `@title:foo -@price:[100 100]`.

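The inclusive and exclusive bound rules above can be captured in a small string builder. This is only a sketch: `numeric_range` and its parameter names are hypothetical, and the function does nothing beyond formatting the clause.

```python
import math


def _bound(value, exclusive):
    # Infinite bounds are written as -inf / +inf; "(" marks an exclusive bound.
    if math.isinf(value):
        text = "+inf" if value > 0 else "-inf"
    else:
        text = str(value)
    return "(" + text if exclusive else text


def numeric_range(field, lo=-math.inf, hi=math.inf,
                  lo_exclusive=False, hi_exclusive=False):
    """Build an @field:[min max] clause; bounds are inclusive by default."""
    return "@{}:[{} {}]".format(
        field, _bound(lo, lo_exclusive), _bound(hi, hi_exclusive))


print(numeric_range("price", 100, 200))               # @price:[100 200]
print(numeric_range("price", 100, lo_exclusive=True))  # @price:[(100 +inf]
print("-" + numeric_range("price", 100, 100))          # -@price:[100 100]
```
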
## Tag filters

RediSearch (starting with version 0.91) allows a special field type called "tag field", with simpler tokenization and encoding in the index. The values in these fields cannot be accessed by general field-less search, and can be used only with a special syntax:

```
@field:{ tag | tag | ...}

e.g.

@cities:{ New York | Los Angeles | Barcelona }
```

Tags can have multiple words or include punctuation marks other than the field's separator (`,` by default). Punctuation marks in tags should be escaped with a backslash (`\`). It is also recommended (but not mandatory) to escape spaces, because a multi-word tag that includes stopwords will otherwise cause a syntax error. So tags like "to be or not to be" should be escaped as "to\ be\ or\ not\ to\ be". For good measure, you can escape all spaces within tags.

Notice that multiple tags in the same clause create a union of documents containing any of the tags. To create an intersection of documents containing *all* tags, you should repeat the tag filter several times, e.g.:

```
# This will return all documents containing all three cities as tags:
@cities:{ New York } @cities:{Los Angeles} @cities:{ Barcelona }

# This will return all documents containing any of the cities:
@cities:{ New York | Los Angeles | Barcelona }
```

Tag clauses can be combined into any sub-clause, used as negative expressions, optional expressions, etc.

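The escaping advice and the union/intersection distinction can be sketched in Python as follows. The helper names are hypothetical, and escaping every character that is not a letter, digit or underscore is a deliberately conservative assumption that covers both punctuation and spaces.

```python
import re


def escape_tag(tag):
    # Prefix every non-word character (punctuation, spaces) with a backslash.
    return re.sub(r"([^\w])", r"\\\1", tag)


def tag_union(field, tags):
    """Documents containing ANY of the tags: one clause, tags joined by '|'."""
    return "@{}:{{{}}}".format(field, " | ".join(escape_tag(t) for t in tags))


def tag_intersection(field, tags):
    """Documents containing ALL of the tags: one clause per tag."""
    return " ".join(tag_union(field, [t]) for t in tags)


cities = ["New York", "Los Angeles", "Barcelona"]
print(tag_union("cities", cities))
print(tag_intersection("cities", cities))
print(escape_tag("to be or not to be"))
```
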
## Geo filters in query

As of version 0.21, it is possible to add geo radius queries directly into the query language with the syntax `@field:[{lon} {lat} {radius} {m|km|mi|ft}]`. This filters the results to a given radius from a lon,lat point, with the radius specified in meters, kilometers, miles or feet. See Redis' own [`GEORADIUS`](https://redis.io/commands/georadius) command for more details, as it is used internally for this.

Radius filters can be added into the query just like numeric filters. For example, in a database of businesses, looking for Chinese restaurants near San Francisco (within a 5km radius) would be expressed as: `chinese restaurant @location:[-122.41 37.77 5 km]`.

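A minimal sketch of building such a clause in Python is shown below. `geo_filter` is a hypothetical helper; the only check it performs is that the unit is one of the four listed above. Note that longitude comes before latitude.

```python
GEO_UNITS = ("m", "km", "mi", "ft")


def geo_filter(field, lon, lat, radius, unit="km"):
    """Build an @field:[lon lat radius unit] clause (longitude first)."""
    if unit not in GEO_UNITS:
        raise ValueError("unit must be one of %s" % ", ".join(GEO_UNITS))
    return "@{}:[{} {} {} {}]".format(field, lon, lat, radius, unit)


query = "chinese restaurant " + geo_filter("location", -122.41, 37.77, 5, "km")
print(query)  # chinese restaurant @location:[-122.41 37.77 5 km]
```
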
## Prefix matching

On index updating, we maintain a dictionary of all terms in the index. This can be used to match all terms starting with a given prefix. Selecting prefix matches is done by appending `*` to a prefix token. For example:

```
hel* world
```

This will be expanded to cover `(hello|help|helm|...) world`.

### A few notes on prefix searches

1. As prefixes can be expanded into a very large number of terms, use them with caution. There is no magic going on: the expansion creates a union operation over all matching terms.

2. As a protective measure to avoid selecting too many terms and blocking Redis, which is single threaded, there are two limitations on prefix matching:

    * Prefixes are limited to 2 letters or more. You can change this number by using the `MINPREFIX` setting on the module command line.

    * Expansion is limited to 200 terms or fewer. You can change this number by using the `MAXEXPANSIONS` setting on the module command line.

3. Prefix matching fully supports Unicode and is case insensitive.

4. Currently, there is no sorting or bias based on suffix popularity, but this is on the near-term roadmap.

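To make the expansion and its two limits concrete, here is a toy Python model of the behaviour described above. It is not how the engine is implemented: the term dictionary is a plain set, raising an error for a too-short prefix is an assumption of this sketch, and so is keeping the alphabetically first terms when the expansion cap is hit.

```python
def expand_prefix(prefix, terms, min_prefix=2, max_expansions=200):
    """Return the terms starting with `prefix`, capped at `max_expansions`."""
    if len(prefix) < min_prefix:
        # Assumption: the sketch rejects short prefixes with an exception.
        raise ValueError("prefix must be at least %d characters" % min_prefix)
    needle = prefix.lower()  # matching is case insensitive
    matches = sorted(t for t in terms if t.lower().startswith(needle))
    return matches[:max_expansions]


def prefix_query(prefix, terms, **limits):
    # The expansion is a union (OR) over every matching term.
    return "(" + "|".join(expand_prefix(prefix, terms, **limits)) + ")"


terms = {"hello", "help", "helm", "held", "world"}
print(prefix_query("hel", terms) + " world")  # (held|hello|helm|help) world
```
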
## Fuzzy matching

As of v1.2.0, the dictionary of all terms in the index can also be used to perform [Fuzzy Matching](https://en.wikipedia.org/wiki/Approximate_string_matching). Fuzzy matches are performed based on [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) (LD). Fuzzy matching on a term is performed by surrounding the term with `%`, for example:

```
%hello% world
```

This performs fuzzy matching on 'hello' for all terms where LD is 1.

As of v1.4.0, the LD of the fuzzy match can be set by the number of `%` characters surrounding the term, so that `%%hello%%` will perform fuzzy matching on 'hello' for all terms where LD is 2.

The maximum LD for fuzzy matching is 3.

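For readers unfamiliar with Levenshtein distance, the Python sketch below computes it with the standard dynamic-programming recurrence and builds the `%`-wrapped term for a given distance. Both function names are hypothetical, and the code says nothing about how the engine itself searches its term dictionary.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def fuzzy(term, distance=1):
    """Wrap `term` in one '%' per unit of distance (1 to 3)."""
    if not 1 <= distance <= 3:
        raise ValueError("distance must be between 1 and 3")
    marks = "%" * distance
    return marks + term + marks


print(levenshtein("hello", "hallo"))  # 1
print(fuzzy("hello", 2) + " world")   # %%hello%% world
```
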
## Wildcard queries

As of version 1.1.0, we provide a special query to retrieve all the documents in an index. This is meant mostly for the aggregation engine. You can call it by specifying only a single star sign as the query string, i.e. `FT.SEARCH myIndex *`.

This cannot be combined with any other filters, field modifiers or anything inside the query. It is technically possible to use the deprecated FILTER and GEOFILTER request parameters outside the query string in conjunction with a wildcard, but this makes the wildcard meaningless and only hurts performance.

## Query attributes

As of version 1.2.0, it is possible to apply specific query modifying attributes to specific clauses of the query.

The syntax is `(foo bar) => { $attribute: value; $attribute:value; ...}`, e.g.:

```
(foo bar) => { $weight: 2.0; $slop: 1; $inorder: true; }
~(bar baz) => { $weight: 0.5; }
```

The supported attributes are:

* **$weight**: determines the weight of the sub-query or token in the overall ranking of the result (default: 1.0).
* **$slop**: determines the maximum allowed "slop" (space between terms) in the query clause (default: 0).
* **$inorder**: whether or not the terms in a query clause must appear in the same order as in the query, usually set alongside `$slop` (default: false).
* **$phonetic**: whether or not to perform phonetic matching (default: true). Note: setting this attribute on for fields which were not created as `PHONETIC` will produce an error.

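The attribute syntax is regular enough to generate. The following Python sketch formats a clause with its attributes; `with_attributes` is a hypothetical helper, and restricting the names to the four attributes listed above is a choice made for this example.

```python
SUPPORTED_ATTRIBUTES = ("weight", "slop", "inorder", "phonetic")


def with_attributes(clause, **attrs):
    """Render `(clause) => { $name: value; ... }` for the given attributes."""
    parts = []
    for name, value in attrs.items():
        if name not in SUPPORTED_ATTRIBUTES:
            raise ValueError("unsupported attribute: $" + name)
        if isinstance(value, bool):
            # Booleans are written in lowercase in the query language.
            value = "true" if value else "false"
        parts.append("${}: {};".format(name, value))
    return "({}) => {{ {} }}".format(clause, " ".join(parts))


print(with_attributes("foo bar", weight=2.0, slop=1, inorder=True))
# (foo bar) => { $weight: 2.0; $slop: 1; $inorder: true; }
```
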
## A few query examples

* Simple phrase query - **hello** AND **world**

        hello world

* Exact phrase query - **hello** FOLLOWED BY **world**

        "hello world"

* Union: documents containing either **hello** OR **world**

        hello|world

* Not: documents containing **hello** but not **world**

        hello -world

* Intersection of unions

        (hello|halo) (world|werld)

* Negation of union

        hello -(world|werld)

* Union inside phrase

        (barack|barrack) obama

* Optional terms with higher priority to ones containing more matches:

        obama ~barack ~michelle

* Exact phrase in one field, one word in another field:

        @title:"barack obama" @job:president

* Combined AND, OR with field specifiers:

        @title:hello world @body:(foo bar) @category:(articles|biographies)

* Prefix queries:

        hello worl*

        hel* worl*

        hello -worl*

* Numeric filtering - products named "tv" with a price range of 200-500:

        @name:tv @price:[200 500]

* Numeric filtering - users with age greater than 18:

        @age:[(18 +inf]

## Mapping common SQL predicates to RediSearch

| SQL Condition | RediSearch Equivalent | Comments |
|---------------|-----------------------|----------|
| WHERE x='foo' AND y='bar' | @x:foo @y:bar | for less ambiguity use (@x:foo) (@y:bar) |
| WHERE x='foo' AND y!='bar' | @x:foo -@y:bar | |
| WHERE x='foo' OR y='bar' | (@x:foo)\|(@y:bar) | |
| WHERE x IN ('foo', 'bar','hello world') | @x:(foo\|bar\|"hello world") | quotes mean exact phrase |
| WHERE y='foo' AND x NOT IN ('foo','bar') | @y:foo (-@x:foo) (-@x:bar) | |
| WHERE x NOT IN ('foo','bar') | -@x:(foo\|bar) | |
| WHERE num BETWEEN 10 AND 20 | @num:[10 20] | |
| WHERE num >= 10 | @num:[10 +inf] | |
| WHERE num > 10 | @num:[(10 +inf] | |
| WHERE num < 10 | @num:[-inf (10] | |
| WHERE num <= 10 | @num:[-inf 10] | |
| WHERE num < 10 OR num > 20 | @num:[-inf (10] \| @num:[(20 +inf] | |
| WHERE name LIKE 'john%' | @name:john* | |

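The numeric rows of the table can be read as a lookup from comparison operator to range template. The Python sketch below encodes exactly that; `comparison_to_query` is a hypothetical name, and the `=` and `!=` cases are not in the table but follow from the `[100 100]` negation example in the numeric filter notes.

```python
# Range templates for numeric comparisons; {f} is the field, {v} the value.
_RANGE_TEMPLATES = {
    "=": "@{f}:[{v} {v}]",
    ">=": "@{f}:[{v} +inf]",
    ">": "@{f}:[({v} +inf]",
    "<": "@{f}:[-inf ({v}]",
    "<=": "@{f}:[-inf {v}]",
}


def comparison_to_query(field, op, value):
    """Translate `field <op> value` on a numeric field into a range clause."""
    if op == "!=":
        # Inequality is the negation of the single-value range.
        return "-" + _RANGE_TEMPLATES["="].format(f=field, v=value)
    if op not in _RANGE_TEMPLATES:
        raise ValueError("unsupported operator: " + op)
    return _RANGE_TEMPLATES[op].format(f=field, v=value)


# WHERE num < 10 OR num > 20
print(" | ".join([
    comparison_to_query("num", "<", 10),
    comparison_to_query("num", ">", 20),
]))
```
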
## Technical note

The query parser is built using the Lemon Parser Generator and a Ragel based lexer. You can see the grammar definition [at the git repo](https://github.com/RediSearch/RediSearch/blob/master/src/query_parser/parser.y).
