.. Licensed under the Apache License, Version 2.0 (the "License"); you may not
.. use this file except in compliance with the License. You may obtain a copy of
.. the License at
..
..   http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
.. WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
.. License for the specific language governing permissions and limitations under
.. the License.

.. _ddoc/search:

======
Search
======

Search indexes enable you to query a database by using the
`Lucene Query Parser Syntax <http://lucene.apache.org/core/4_3_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Overview>`_.
A search index uses one or more fields from your documents. You can use a search
index to run queries, find documents based on the content they contain, or work with
groups, facets, or geographical searches.

.. warning::
    Search cannot function unless it has a functioning, cluster-connected
    Clouseau instance. See :ref:`Search Plugin Installation <install/search>`
    for details.

To create a search index, you add a JavaScript function to a design document in the
database. An index builds after processing one search request or after the server detects
a document update. The ``index`` function takes the following parameters:

1. Field name - The name of the field you want to use when you query the index. If you
   set this parameter to ``default``, then this field is queried if no field is specified in
   the query syntax.

2. Data that you want to index, for example, ``doc.address.country``.

3. (Optional) The third parameter includes the following fields: ``boost``, ``facet``,
   ``index``, and ``store``. These fields are described in more detail later.

By default, a search index response returns 25 rows.
The number of rows that is returned
can be changed by using the ``limit`` parameter. Each response includes a ``bookmark``
field. You can include the value of the ``bookmark`` field in later queries to page
through the results.

*Example design document that defines a search index:*

.. code-block:: javascript

    {
        "_id": "_design/search_example",
        "indexes": {
            "animals": {
                "index": "function(doc){ ... }"
            }
        }
    }

A search index inherits the partitioning type from the ``options.partitioned`` field
of the design document that contains it.

Index functions
===============

Attempting to index by using a data field that does not exist fails. To avoid
this problem, use the appropriate
:ref:`guard clause <ddoc/search/index_guard_clauses>`.

.. note::
    Your indexing functions operate in a memory-constrained environment
    where the document itself forms a part of the memory that is used
    in that environment. Your code's stack and document must fit inside this
    memory. In other words, a document must be loaded in order to be indexed.
    Documents are limited to a maximum size of 64 MB.

.. note::
    Within a search index, do not index the same field name with more than one data
    type. If the same field name is indexed with different data types in the same search
    index function, you might get an error when querying the search index that says the
    field "was indexed without position data." For example, do not include both of these
    lines in the same search index function, as they index the ``myfield`` field as two
    different data types: a string ``"this is a string"`` and a number ``123``.

.. code-block:: javascript

    index("myfield", "this is a string");
    index("myfield", 123);

The function that is contained in the ``index`` field is a JavaScript function
that is called for each document in the database.
The function takes the document as a parameter,
extracts some data from it, and then calls the ``index`` function
to index that data.

The ``index`` function takes three parameters, where the third parameter is optional.

The first parameter is the name of the field you intend to use when querying the index,
and which is specified in the Lucene syntax portion of subsequent queries.
An example appears in the following query:

.. code-block:: text

    query=color:red

The Lucene field name ``color`` is the first parameter of the ``index`` function.

The ``query`` parameter can be abbreviated to ``q``,
so another way of writing the query is as follows:

.. code-block:: text

    q=color:red

If the special value ``"default"`` is used when you define the name,
you do not have to specify a field name at query time.
The effect is that the query can be simplified:

.. code-block:: text

    query=red

The second parameter is the data to be indexed. Keep the following information
in mind when you index your data:

- This data must be only a string, number, or boolean. Other types will cause
  an error to be thrown by the index function call.

- If an error is thrown when running your function, for this reason or others,
  the document will not be added to that search index.

The third, optional, parameter is a JavaScript object with the following fields:

*Index function (optional parameter)*

* **boost** - A number that specifies the relevance in search results. Content that is
  indexed with a boost value greater than 1 is more relevant than content that is
  indexed without a boost value. Content with a boost value less than 1 is less
  relevant. Value is a positive floating point number. Default is 1 (no boosting).

* **facet** - Creates a faceted index.
  See :ref:`Faceting <ddoc/search/faceting>`.
  Values are ``true`` or ``false``. Default is ``false``.

* **index** - Whether the data is indexed, and if so, how. If set to ``false``, the data
  cannot be used for searches, but can still be retrieved from the index if ``store`` is
  set to ``true``. See :ref:`Analyzers <ddoc/search/analyzers>`.
  Values are ``true`` or ``false``. Default is ``true``.

* **store** - If ``true``, the value is returned in the search result; otherwise,
  the value is not returned. Values are ``true`` or ``false``. Default is ``false``.

.. note::

    If you do not set the ``store`` parameter,
    the index data results for the document are not returned in response to a query.

*Example search index function:*

.. code-block:: javascript

    function(doc) {
        index("default", doc._id);
        if (doc.min_length) {
            index("min_length", doc.min_length, {"store": true});
        }
        if (doc.diet) {
            index("diet", doc.diet, {"store": true});
        }
        if (doc.latin_name) {
            index("latin_name", doc.latin_name, {"store": true});
        }
        if (doc.class) {
            index("class", doc.class, {"store": true});
        }
    }

.. _ddoc/search/index_guard_clauses:

Index guard clauses
-------------------

The ``index`` function requires the data to index as the second
parameter. However, if that data field does not exist for the document, an error occurs.
The solution is to use an appropriate 'guard clause' that checks if the field exists, and
contains the expected type of data, *before* any attempt to create the corresponding
index.

*Example of failing to check whether the index data field exists:*

.. code-block:: javascript

    index("min_length", doc.min_length, {"store": true});

You might use the JavaScript ``typeof`` operator to implement the guard clause test.
If the field exists *and* has the expected type, the correct type name is returned, so the
guard clause test succeeds and it is safe to use the index function. If the field does
*not* exist, you would not get back the expected type of the field, therefore you would
not attempt to index the field.

JavaScript considers a result to be false if one of the following values is tested:

* ``undefined``
* ``null``
* The number +0
* The number -0
* NaN (not a number)
* "" (the empty string)

*Using a guard clause to check whether the required data field exists, and holds a number,
before an attempt to index:*

.. code-block:: javascript

    if (typeof(doc.min_length) === 'number') {
        index("min_length", doc.min_length, {"store": true});
    }

Use a generic guard clause test to ensure that the type of the candidate data field is
defined.

*Example of a 'generic' guard clause:*

.. code-block:: javascript

    if (typeof(doc.min_length) !== 'undefined') {
        // The field exists, and does have a type, so we can proceed to index using it.
        ...
    }

.. _ddoc/search/analyzers:

Analyzers
=========

Analyzers are settings that define how to recognize terms within text. Analyzers can be
helpful if you need to
:ref:`index multiple languages <ddoc/search/language-specific-analyzers>`.

Here's the list of generic analyzers, and their descriptions, that are supported by
search:

- ``classic`` - The standard Lucene analyzer, circa release 3.1.
- ``email`` - Like the ``standard`` analyzer, but tries harder to
  match an email address as a complete token.
- ``keyword`` - Input is not tokenized at all.
- ``simple`` - Divides text at non-letters.
- ``standard`` - The default analyzer.
  It implements the Word Break
  rules from the `Unicode Text Segmentation algorithm <http://www.unicode.org/reports/tr29/>`_.
- ``whitespace`` - Divides text at white space boundaries.

*Example analyzer document:*

.. code-block:: javascript

    {
        "_id": "_design/analyzer_example",
        "indexes": {
            "INDEX_NAME": {
                "index": "function (doc) { ... }",
                "analyzer": "$ANALYZER_NAME"
            }
        }
    }

.. _ddoc/search/language-specific-analyzers:

Language-specific analyzers
---------------------------

These analyzers omit common words in the specific language,
and many also `remove prefixes and suffixes <http://en.wikipedia.org/wiki/Stemming>`_.
The name of the language is also the name of the analyzer. See
`package org.apache.lucene.analysis <https://lucene.apache.org/core/4_6_1/core/org/apache/lucene/analysis/package-summary.html>`_
for more information.

+----------------+----------------------------------------------------------+
| Language       | Analyzer                                                 |
+================+==========================================================+
| ``arabic``     | org.apache.lucene.analysis.ar.ArabicAnalyzer             |
+----------------+----------------------------------------------------------+
| ``armenian``   | org.apache.lucene.analysis.hy.ArmenianAnalyzer           |
+----------------+----------------------------------------------------------+
| ``basque``     | org.apache.lucene.analysis.eu.BasqueAnalyzer             |
+----------------+----------------------------------------------------------+
| ``bulgarian``  | org.apache.lucene.analysis.bg.BulgarianAnalyzer          |
+----------------+----------------------------------------------------------+
| ``brazilian``  | org.apache.lucene.analysis.br.BrazilianAnalyzer          |
+----------------+----------------------------------------------------------+
| ``catalan``    | org.apache.lucene.analysis.ca.CatalanAnalyzer            |
+----------------+----------------------------------------------------------+
| ``cjk``        | org.apache.lucene.analysis.cjk.CJKAnalyzer               |
+----------------+----------------------------------------------------------+
| ``chinese``    | org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer |
+----------------+----------------------------------------------------------+
| ``czech``      | org.apache.lucene.analysis.cz.CzechAnalyzer              |
+----------------+----------------------------------------------------------+
| ``danish``     | org.apache.lucene.analysis.da.DanishAnalyzer             |
+----------------+----------------------------------------------------------+
| ``dutch``      | org.apache.lucene.analysis.nl.DutchAnalyzer              |
+----------------+----------------------------------------------------------+
| ``english``    | org.apache.lucene.analysis.en.EnglishAnalyzer            |
+----------------+----------------------------------------------------------+
| ``finnish``    | org.apache.lucene.analysis.fi.FinnishAnalyzer            |
+----------------+----------------------------------------------------------+
| ``french``     | org.apache.lucene.analysis.fr.FrenchAnalyzer             |
+----------------+----------------------------------------------------------+
| ``german``     | org.apache.lucene.analysis.de.GermanAnalyzer             |
+----------------+----------------------------------------------------------+
| ``greek``      | org.apache.lucene.analysis.el.GreekAnalyzer              |
+----------------+----------------------------------------------------------+
| ``galician``   | org.apache.lucene.analysis.gl.GalicianAnalyzer           |
+----------------+----------------------------------------------------------+
| ``hindi``      | org.apache.lucene.analysis.hi.HindiAnalyzer              |
+----------------+----------------------------------------------------------+
| ``hungarian``  | org.apache.lucene.analysis.hu.HungarianAnalyzer          |
+----------------+----------------------------------------------------------+
| ``indonesian`` | org.apache.lucene.analysis.id.IndonesianAnalyzer         |
+----------------+----------------------------------------------------------+
| ``irish``      | org.apache.lucene.analysis.ga.IrishAnalyzer              |
+----------------+----------------------------------------------------------+
| ``italian``    | org.apache.lucene.analysis.it.ItalianAnalyzer            |
+----------------+----------------------------------------------------------+
| ``japanese``   | org.apache.lucene.analysis.ja.JapaneseAnalyzer           |
+----------------+----------------------------------------------------------+
| ``japanese``   | org.apache.lucene.analysis.ja.JapaneseTokenizer          |
+----------------+----------------------------------------------------------+
| ``latvian``    | org.apache.lucene.analysis.lv.LatvianAnalyzer            |
+----------------+----------------------------------------------------------+
| ``norwegian``  | org.apache.lucene.analysis.no.NorwegianAnalyzer          |
+----------------+----------------------------------------------------------+
| ``persian``    | org.apache.lucene.analysis.fa.PersianAnalyzer            |
+----------------+----------------------------------------------------------+
| ``polish``     | org.apache.lucene.analysis.pl.PolishAnalyzer             |
+----------------+----------------------------------------------------------+
| ``portuguese`` | org.apache.lucene.analysis.pt.PortugueseAnalyzer         |
+----------------+----------------------------------------------------------+
| ``romanian``   | org.apache.lucene.analysis.ro.RomanianAnalyzer           |
+----------------+----------------------------------------------------------+
| ``russian``    | org.apache.lucene.analysis.ru.RussianAnalyzer            |
+----------------+----------------------------------------------------------+
| ``spanish``    | org.apache.lucene.analysis.es.SpanishAnalyzer            |
+----------------+----------------------------------------------------------+
| ``swedish``    | org.apache.lucene.analysis.sv.SwedishAnalyzer            |
+----------------+----------------------------------------------------------+
| ``thai``       | org.apache.lucene.analysis.th.ThaiAnalyzer               |
+----------------+----------------------------------------------------------+
| ``turkish``    | org.apache.lucene.analysis.tr.TurkishAnalyzer            |
+----------------+----------------------------------------------------------+

.. note::

    The ``japanese`` analyzer, org.apache.lucene.analysis.ja.JapaneseTokenizer,
    includes DEFAULT_MODE and defaultStopTags.

.. note::

    Language-specific analyzers are optimized for the specified language. You cannot
    combine a generic analyzer with a language-specific analyzer. Instead, you might use a
    :ref:`per field analyzer <ddoc/search/per-field-analyzers>` to select different
    analyzers for different fields within the documents.

.. _ddoc/search/per-field-analyzers:

Per-field analyzers
-------------------

The ``perfield`` analyzer configures multiple analyzers for different fields.

*Example of defining different analyzers for different fields:*

.. code-block:: javascript

    {
        "_id": "_design/analyzer_example",
        "indexes": {
            "INDEX_NAME": {
                "analyzer": {
                    "name": "perfield",
                    "default": "english",
                    "fields": {
                        "spanish": "spanish",
                        "german": "german"
                    }
                },
                "index": "function (doc) { ... }"
            }
        }
    }

Stop words
----------

Stop words are words that do not get indexed. You define them within a design document by
turning the analyzer string into an object.

.. note::

    The ``keyword``, ``simple``, and ``whitespace`` analyzers do not support stop words.

The default stop words for the ``standard`` analyzer are included below:

.. code-block:: javascript

    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
    "that", "the", "their", "then", "there", "these", "they", "this",
    "to", "was", "will", "with"

*Example of defining non-indexed ('stop') words:*

.. code-block:: javascript

    {
        "_id": "_design/stop_words_example",
        "indexes": {
            "INDEX_NAME": {
                "analyzer": {
                    "name": "portuguese",
                    "stopwords": [
                        "foo",
                        "bar",
                        "baz"
                    ]
                },
                "index": "function (doc) { ... }"
            }
        }
    }

Testing analyzer tokenization
-----------------------------

You can test the results of analyzer tokenization by posting sample data to the
``_search_analyze`` endpoint.

*Example of using HTTP to test the keyword analyzer:*

.. code-block:: http

    POST /_search_analyze HTTP/1.1
    Content-Type: application/json

    {"analyzer":"keyword", "text":"ablanks@renovations.com"}

*Example of using the command line to test the keyword analyzer:*

.. code-block:: sh

    curl 'https://$HOST:5984/_search_analyze' -H 'Content-Type: application/json' \
        -d '{"analyzer":"keyword", "text":"ablanks@renovations.com"}'

*Result of testing the keyword analyzer:*

.. code-block:: javascript

    {
        "tokens": [
            "ablanks@renovations.com"
        ]
    }

*Example of using HTTP to test the standard analyzer:*

.. code-block:: http

    POST /_search_analyze HTTP/1.1
    Content-Type: application/json

    {"analyzer":"standard", "text":"ablanks@renovations.com"}

*Example of using the command line to test the standard analyzer:*

.. code-block:: sh

    curl 'https://$HOST:5984/_search_analyze' -H 'Content-Type: application/json' \
        -d '{"analyzer":"standard", "text":"ablanks@renovations.com"}'

*Result of testing the standard analyzer:*

.. code-block:: javascript

    {
        "tokens": [
            "ablanks",
            "renovations.com"
        ]
    }

Queries
=======

After you create a search index, you can query it.

- Issue a partition query using:
  ``GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_search/$INDEX_NAME``
- Issue a global query using:
  ``GET /$DATABASE/_design/$DDOC/_search/$INDEX_NAME``

Specify your search by using the ``query`` parameter.

*Example of using HTTP to query a partitioned index:*

.. code-block:: http

    GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1
    Content-Type: application/json

*Example of using HTTP to query a global index:*

.. code-block:: http

    GET /$DATABASE/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1
    Content-Type: application/json

*Example of using the command line to query a partitioned index:*

.. code-block:: sh

    curl https://$HOST:5984/$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_search/$INDEX_NAME?include_docs=true\&query="*:*"\&limit=1

*Example of using the command line to query a global index:*

.. code-block:: sh

    curl https://$HOST:5984/$DATABASE/_design/$DDOC/_search/$INDEX_NAME?include_docs=true\&query="*:*"\&limit=1

.. _ddoc/search/query_parameters:

Query Parameters
----------------

A full list of query parameters can be found in the
:ref:`API Reference <api/ddoc/search>`.

You must enable :ref:`faceting <ddoc/search/faceting>` before you can use the
following parameters:

- ``counts``
- ``drilldown``
- ``ranges``

.. note::
    Do not combine the ``bookmark`` and ``stale`` options. These options
    constrain the choice of shard replicas to use for the response.
    When used
    together, the options might cause problems when contact is attempted
    with replicas that are slow or not available.

Relevance
---------

When more than one result might be returned, it is possible for them to be sorted. By
default, the sorting order is determined by 'relevance'.

Relevance is measured according to
`Apache Lucene Scoring <https://lucene.apache.org/core/3_6_0/scoring.html>`_.
As an example, if you search a simple database for the word ``example``, two documents
might contain the word. If one document mentions the word ``example`` 10 times, but the
second document mentions it only twice, then the first document is considered to be more
'relevant'.

If you do not provide a ``sort`` parameter, relevance is used by default. The highest
scoring matches are returned first.

If you provide a ``sort`` parameter, then matches are returned in that order, ignoring
relevance.

If you want to use a ``sort`` parameter, and also include ordering by relevance in your
search results, use the special fields ``-<score>`` or ``<score>`` within the ``sort``
parameter.

POSTing search queries
----------------------

Instead of using the ``GET`` HTTP method, you can also use ``POST``. The main advantage of
``POST`` queries is that they can have a request body, so you can specify the request as a
JSON object. Each parameter in the query string of a ``GET`` request corresponds to a
field in the JSON object in the request body.

*Example of using HTTP to POST a search request:*

.. code-block:: http

    POST /db/_design/ddoc/_search/searchname HTTP/1.1
    Content-Type: application/json

*Example of using the command line to POST a search request:*

.. code-block:: sh

    curl 'https://$HOST:5984/db/_design/ddoc/_search/searchname' -X POST -H 'Content-Type: application/json' -d @search.json

*Example JSON document that contains a search request:*

.. code-block:: javascript

    {
        "q": "index:my query",
        "sort": "foo",
        "limit": 3
    }

Query syntax
============

The CouchDB search query syntax is based on the
`Lucene syntax <http://lucene.apache.org/core/4_3_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Overview>`_.
Search queries take the form of ``name:value`` unless the name is omitted, in which case
they use the default field, as demonstrated in the following examples:

*Example search query expressions:*

.. code-block:: text

    // Birds
    class:bird

.. code-block:: text

    // Animals that begin with the letter "l"
    l*

.. code-block:: text

    // Carnivorous birds
    class:bird AND diet:carnivore

.. code-block:: text

    // Herbivores that start with letter "l"
    l* AND diet:herbivore

.. code-block:: text

    // Medium-sized herbivores
    min_length:[1 TO 3] AND diet:herbivore

.. code-block:: text

    // Herbivores that are 2m long or less
    diet:herbivore AND min_length:[-Infinity TO 2]

.. code-block:: text

    // Mammals that are at least 1.5m long
    class:mammal AND min_length:[1.5 TO Infinity]

.. code-block:: text

    // Find "Meles meles"
    latin_name:"Meles meles"

.. code-block:: text

    // Mammals that are herbivores or omnivores
    diet:(herbivore OR omnivore) AND class:mammal

.. code-block:: text

    // Return all results
    *:*

Queries over multiple fields can be logically combined, and groups and fields can be
further grouped. The available logical operators are case-sensitive and are ``AND``,
``+``, ``OR``, ``NOT`` and ``-``.
Range queries can run over strings or numbers.

If you want a fuzzy search, you can run a query with ``~`` to find terms like the search
term. For instance, ``look~`` finds the terms ``book`` and ``took``.

.. note::
    If the lower and upper bounds of a range query are both strings that
    contain only numeric digits, the bounds are treated as numbers not as
    strings. For example, if you search by using the query
    ``mod_date:["20170101" TO "20171231"]``, the results include documents
    for which ``mod_date`` is between the numeric values 20170101 and
    20171231, not between the strings "20170101" and "20171231".

You can alter the importance of a search term by adding ``^`` and a positive number. This
alteration makes matches containing the term more or less relevant, proportional to the
power of the boost value. The default value is 1, which means no increase or decrease in
the strength of the match. A decimal value between 0 and 1 reduces importance, making the
match strength weaker. A value greater than 1 increases importance, making the match
strength stronger.

Wildcard searches are supported, for both single (``?``) and multiple (``*``) character
searches. For example, ``dat?`` would match ``date`` and ``data``, whereas ``dat*`` would
match ``date``, ``data``, ``database``, and ``dates``. Wildcards must come after the
search term.

Use ``*:*`` to return all results.

If the search query does *not* specify the ``"group_field"`` argument, the response
contains a bookmark. If this bookmark is later provided as a URL parameter, the response
skips the rows that were seen already, making it quick and easy to get the next set of
results.

.. note::
    The response never includes a bookmark if the ``"group_field"``
    parameter is included in the search query.
    See :ref:`group_field parameter <api/ddoc/search>`.

.. note::
    The ``group_field``, ``group_limit``, and ``group_sort`` options
    are only available when making global queries.

The following characters require escaping if you want to search on them:

.. code-block:: sh

    + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /

To escape one of these characters, use a preceding backslash character (``\``).

The response to a search query contains an ``order`` field for each of the results. The
``order`` field is an array where the first element is the field or fields that are
specified in the ``sort`` parameter. See the
:ref:`sort parameter <api/ddoc/search>`. If no ``sort`` parameter is included
in the query, then the ``order`` field contains the `Lucene relevance score
<https://lucene.apache.org/core/3_6_0/scoring.html>`_. If you use the 'sort by distance'
feature as described in :ref:`geographical searches <ddoc/search/geographical_searches>`,
then the first element is the distance from a point. The distance is measured by using
either kilometers or miles.

.. note::
    The second element in the order array can be ignored.
    It is used for troubleshooting purposes only.

.. _ddoc/search/faceting:

Faceting
--------

CouchDB Search also supports faceted searching, enabling discovery of aggregate
information about matches quickly and easily. You can match all documents by using the
special ``?q=*:*`` query syntax, and use the returned facets to refine your query. To
indicate that a field must be indexed for faceted queries, set ``{"facet": true}`` in its
options.

*Example of a search index function that enables faceting on two fields:*

.. code-block:: javascript

    function(doc) {
        index("type", doc.type, {"facet": true});
        index("price", doc.price, {"facet": true});
    }

To use facets, all the documents in the index must include all the fields that have
faceting enabled.
If your documents do not include all the fields, you receive a
``bad_request`` error with the following reason, "The ``field_name`` does not exist." If
not every document contains all the facet fields, create separate indexes for each
field. If you do not create separate indexes for each field, you must include only
documents that contain all the fields. Verify that the fields exist in each document by
using a single ``if`` statement.

*Example if statement to verify that the required fields exist in each document:*

.. code-block:: javascript

    if (typeof doc.town == "string" && typeof doc.name == "string") {
        index("town", doc.town, {facet: true});
        index("name", doc.name, {facet: true});
    }

Counts
------

.. note::
    The ``counts`` option is only available when making global queries.

The ``counts`` facet syntax takes a list of fields, and returns the number of query
results for each unique value of each named field.

.. note::
    The ``counts`` operation works only if the indexed values are strings.
    The indexed values cannot be mixed types. For example,
    if 100 strings are indexed, and one number,
    then the index cannot be used for ``counts`` operations.
    You can check the type by using the ``typeof`` operator, and convert it
    by using the ``parseInt``,
    ``parseFloat``, or ``.toString()`` functions.

*Example of a query using the counts facet syntax:*

.. code-block:: http

    ?q=*:*&counts=["type"]

*Example response after using the counts facet syntax:*

.. code-block:: javascript

    {
        "total_rows":100000,
        "bookmark":"g...",
        "rows":[...],
        "counts":{
            "type":{
                "sofa": 10,
                "chair": 100,
                "lamp": 97
            }
        }
    }

Drilldown
---------

.. note::
    The ``drilldown`` option is only available when making global queries.


You can restrict results to documents with a dimension equal to the specified label.
Restrict the results by adding ``drilldown=["dimension","label"]`` to a search query. You
can include multiple ``drilldown`` parameters to restrict results along multiple
dimensions.

.. code-block:: http

    GET /things/_design/inventory/_search/fruits?q=*:*&drilldown=["state","old"]&drilldown=["item","apple"]&include_docs=true HTTP/1.1

For better language interoperability, you can achieve the same by supplying a list of lists:

.. code-block:: http

    GET /things/_design/inventory/_search/fruits?q=*:*&drilldown=[["state","old"],["item","apple"]]&include_docs=true HTTP/1.1

You can also supply a list of lists for ``drilldown`` in bodies of POST requests.

Note that multiple values for a single key in a ``drilldown`` imply an ``OR`` relation
between them, and multiple keys imply an ``AND`` relation between them.

Using a ``drilldown`` parameter is similar to using ``key:value`` in the ``q`` parameter,
but the ``drilldown`` parameter returns values that the analyzer might skip.

For example, if the analyzer did not index a stop word like ``"a"``, using ``drilldown``
returns it when you specify ``drilldown=["key","a"]``.

Ranges
------

.. note::
    The ``ranges`` option is only available when making global queries.

The ``ranges`` facet syntax reuses the standard Lucene syntax for ranges to return counts
of results that fit into each specified category. Inclusive range queries are denoted by
brackets (``[``, ``]``). Exclusive range queries are denoted by curly brackets (``{``,
``}``).

.. note::
    The ``ranges`` operation works only if the indexed values are numbers. The indexed
    values cannot be mixed types. For example, if 100 strings are indexed, and one number,
    then the index cannot be used for ``ranges`` operations.
You can check the type by 855 using the ``typeof`` operator, and convert it by using the ``parseInt``, 856 ``parseFloat``, or ``.toString()`` functions. 857 858*Example of a request that uses faceted search for matching ranges:* 859 860.. code-block:: http 861 862 ?q=*:*&ranges={"price":{"cheap":"[0 TO 100]","expensive":"{100 TO Infinity}"}} 863 864*Example results after a ranges check on a faceted search:* 865 866.. code-block:: javascript 867 868 { 869 "total_rows":100000, 870 "bookmark":"g...", 871 "rows":[...], 872 "ranges": { 873 "price": { 874 "expensive": 278682, 875 "cheap": 257023 876 } 877 } 878 } 879 880.. _ddoc/search/geographical_searches: 881 882Geographical searches 883===================== 884 885In addition to searching by the content of textual fields, you can also sort your results 886by their distance from a geographic coordinate using Lucene's built-in geospatial 887capabilities. 888 889To sort your results in this way, you must index two numeric fields, representing the 890longitude and latitude. 891 892.. note:: 893 You can also sort your results by their distance from a geographic coordinate 894 using Lucene's built-in geospatial capabilities. 895 896You can then query by using the special ``<distance...>`` sort field, which takes five 897parameters: 898 899- Longitude field name: The name of your longitude field (``mylon`` in the example). 900 901- Latitude field name: The name of your latitude field (``mylat`` in the example). 902 903- Longitude of origin: The longitude of the place you want to sort by distance from. 904 905- Latitude of origin: The latitude of the place you want to sort by distance from. 906 907- Units: The units to use: ``km`` for kilometers or ``mi`` for miles. 908 The distance is returned in the order field. 909 910You can combine sorting by distance with any other search query, such as range searches on 911the latitude and longitude, or queries that involve non-geographical information. 
912 913That way, you can search in a bounding box, and narrow down the search with extra 914criteria. 915 916*Example geographical data:* 917 918.. code-block:: javascript 919 920 { 921 "name":"Aberdeen, Scotland", 922 "lat":57.15, 923 "lon":-2.15, 924 "type":"city" 925 } 926 927*Example of a design document that contains a search index for the geographic data:* 928 929.. code-block:: javascript 930 931 function(doc) { 932 if (doc.type && doc.type == 'city') { 933 index('city', doc.name, {'store': true}); 934 index('lat', doc.lat, {'store': true}); 935 index('lon', doc.lon, {'store': true}); 936 } 937 } 938 939*An example of using HTTP for a query that sorts cities in the northern hemisphere by 940their distance to New York:* 941 942.. code-block:: http 943 944 GET /examples/_design/cities-designdoc/_search/cities?q=lat:[0+TO+90]&sort="<distance,lon,lat,-74.0059,40.7127,km>" HTTP/1.1 945 946*An example of using the command line for a query that sorts cities in the northern 947hemisphere by their distance to New York:* 948 949.. code-block:: sh 950 951 curl 'https://$HOST:5984/examples/_design/cities-designdoc/_search/cities?q=lat:[0+TO+90]&sort="<distance,lon,lat,-74.0059,40.7127,km>"' 952 953*Example (abbreviated) response, containing a list of northern hemisphere 954cities sorted by distance to New York:* 955 956.. 
code-block:: javascript 957 958 { 959 "total_rows": 205, 960 "bookmark": "g1A...XIU", 961 "rows": [ 962 { 963 "id": "city180", 964 "order": [ 965 8.530665755719783, 966 18 967 ], 968 "fields": { 969 "city": "New York, N.Y.", 970 "lat": 40.78333333333333, 971 "lon": -73.96666666666667 972 } 973 }, 974 { 975 "id": "city177", 976 "order": [ 977 13.756343205985946, 978 17 979 ], 980 "fields": { 981 "city": "Newark, N.J.", 982 "lat": 40.733333333333334, 983 "lon": -74.16666666666667 984 } 985 }, 986 { 987 "id": "city178", 988 "order": [ 989 113.53603438866077, 990 26 991 ], 992 "fields": { 993 "city": "New Haven, Conn.", 994 "lat": 41.31666666666667, 995 "lon": -72.91666666666667 996 } 997 } 998 ] 999 } 1000 1001Highlighting search terms 1002========================= 1003 1004Sometimes it is useful to get the context in which a search term was mentioned so that you 1005can display more emphasized results to a user. 1006 1007To get more emphasized results, add the ``highlight_fields`` parameter to the search 1008query. Specify the field names for which you would like excerpts, with the highlighted 1009search term returned. 1010 1011By default, the search term is placed in ``<em>`` tags to highlight it, but the highlight 1012can be overridden by using the ``highlights_pre_tag`` and ``highlights_post_tag`` 1013parameters. 1014 1015The length of the fragments is 100 characters by default. A different length can be 1016requested with the ``highlights_size`` parameter. 1017 1018The ``highlights_number`` parameter controls the number of fragments that are returned, 1019and defaults to 1. 1020 1021In the response, a ``highlights`` field is added, with one subfield per field name. 1022 1023For each field, you receive an array of fragments with the search term highlighted. 1024 1025.. note:: 1026 For highlighting to work, store the field in the index by 1027 using the ``store: true`` option. 1028 1029*Example of using HTTP to search with highlighting enabled:* 1030 1031.. 
code-block:: http 1032 1033 GET /movies/_design/searches/_search/movies?q=movie_name:Azazel&highlight_fields=["movie_name"]&highlight_pre_tag="**"&highlight_post_tag="**"&highlights_size=30&highlights_number=2 HTTP/1.1 1034 Authorization: ... 1035 1036*Example of using the command line to search with 1037highlighting enabled:* 1038 1039.. code-block:: sh 1040 1041 curl "https://$HOST:5984/movies/_design/searches/_search/movies?q=movie_name:Azazel&highlight_fields=\[\"movie_name\"\]&highlight_pre_tag=\"**\"&highlight_post_tag=\"**\"&highlights_size=30&highlights_number=2 1042 1043*Example of highlighted search results:* 1044 1045.. code-block:: javascript 1046 1047 { 1048 "highlights": { 1049 "movie_name": [ 1050 " on the Azazel Orient Express", 1051 " Azazel manuals, you" 1052 ] 1053 } 1054 } 1055
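
As a rough illustration of how a client might work with highlighting, the following
JavaScript sketch builds the query string for a search that requests excerpts and then
flattens a ``highlights`` object into a list of fragments. The helper names
``highlightQueryString`` and ``collectFragments`` are hypothetical and are not part of
CouchDB. The sketch assumes a runtime that provides the standard ``URLSearchParams``
global, such as Node.js or a browser, and it only relies on the ``q`` and
``highlight_fields`` parameters described above.

.. code-block:: javascript

    // Hypothetical helper (not a CouchDB API): builds the query string for a
    // search request that asks for highlighted excerpts of the given fields.
    function highlightQueryString(query, fields) {
        const params = new URLSearchParams();
        params.set("q", query);
        // highlight_fields is passed as a JSON array of field names;
        // URLSearchParams percent-encodes it when serialized.
        params.set("highlight_fields", JSON.stringify(fields));
        return params.toString();
    }

    // Hypothetical helper: flattens a "highlights" object (one subfield per
    // field name, each holding an array of fragments) into a list of
    // { field, fragment } pairs. A missing object yields an empty list.
    function collectFragments(highlights) {
        const result = [];
        for (const [field, fragments] of Object.entries(highlights || {})) {
            for (const fragment of fragments) {
                result.push({ field, fragment });
            }
        }
        return result;
    }

    // Sample data copied from the highlighted search results shown above.
    const sampleHighlights = {
        movie_name: [" on the Azazel Orient Express", " Azazel manuals, you"]
    };

    console.log(highlightQueryString("movie_name:Azazel", ["movie_name"]));
    console.log(collectFragments(sampleHighlights));

Letting ``URLSearchParams`` handle the encoding avoids hand-escaping the brackets and
quotes in the JSON array, which is the part of these requests that is easiest to get
wrong on the command line.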