.. Licensed under the Apache License, Version 2.0 (the "License"); you may not
.. use this file except in compliance with the License. You may obtain a copy of
.. the License at
..
..   http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
.. WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
.. License for the specific language governing permissions and limitations under
.. the License.

.. _replication/intro:

===========================
Introduction to Replication
===========================

One of CouchDB's strengths is the ability to synchronize two copies of the same
database. This enables users to distribute data across several nodes or
data centers, and also to move data closer to clients.

Replication involves a source and a destination database, which can be on the
same or on different CouchDB instances. The aim of replication is that at
the end of the process, all active documents in the source database are also in
the destination database and all documents that were deleted in the source
database are also deleted in the destination database (if they existed there).

Transient and Persistent Replication
====================================

There are two different ways to set up a replication. The first one that was
introduced into CouchDB leads to a replication that could be called `transient`.
Transient means that no documents back the replication, so it disappears
after a restart of the CouchDB server. Later, the
:ref:`_replicator <replicator>` database was introduced, which keeps documents
containing your replication parameters. Such a replication can be called `persistent`.
Transient replications were kept for backward compatibility. Both kinds of
replication can be in different :ref:`replication states <replicator/states>`.

Triggering, Stopping and Monitoring Replications
================================================

A persistent replication is controlled through a document in the
:ref:`_replicator <replicator>` database, where each document describes one
replication process (see :ref:`replication-settings`). To set up a
transient replication, the API endpoint
:ref:`/_replicate <api/server/replicate>` can be used. A replication is triggered
either by sending a JSON object to the ``_replicate`` endpoint or by storing it
as a document in the ``_replicator`` database.
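
As a minimal sketch of both ways to trigger a replication, the Python snippet
below builds the JSON object and the two HTTP requests without sending them.
The server address ``http://localhost:5984``, the database names ``db_a`` and
``db_b``, the document ID ``db_a_to_db_b`` and the helper names are assumptions
made for illustration; authentication is omitted.

.. code-block:: python

    import json
    import urllib.parse
    import urllib.request

    COUCH_URL = "http://localhost:5984"  # assumed server address


    def replication_body(source, target, continuous=False):
        """Build the JSON object that describes one replication."""
        body = {"source": source, "target": target}
        if continuous:
            body["continuous"] = True
        return body


    def json_request(method, path, body):
        """Prepare (but do not send) a JSON request to the assumed server."""
        return urllib.request.Request(
            COUCH_URL + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method=method,
        )


    body = replication_body(COUCH_URL + "/db_a", COUCH_URL + "/db_b")

    # Transient: POST the object to the _replicate endpoint.
    transient = json_request("POST", "/_replicate", body)

    # Persistent: store the same object as a document in _replicator.
    doc_id = urllib.parse.quote("db_a_to_db_b", safe="")  # hypothetical ID
    persistent = json_request("PUT", "/_replicator/" + doc_id, body)

Passing either prepared request to ``urllib.request.urlopen`` would send it to
a running server.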

If a replication is currently running, its status can be inspected through the
active tasks API (see :ref:`api/server/active_tasks`, :ref:`replication-status`
and :ref:`api/server/_scheduler/jobs`).

For document-based replications, :ref:`api/server/_scheduler/docs` can be used to
get a complete state summary. This API is preferred as it will show the state of the
replication document before it becomes a replication job.
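
The monitoring endpoints can be queried with plain HTTP GET requests. The
following Python sketch only assembles the URLs and defines a small fetch
helper; the server address, the document ID and the helper names are
assumptions, and ``fetch_json`` needs a running server to return anything.

.. code-block:: python

    import json
    import urllib.parse
    import urllib.request

    COUCH_URL = "http://localhost:5984"  # assumed server address

    MONITORING_URLS = {
        "active_tasks": COUCH_URL + "/_active_tasks",
        "scheduler_jobs": COUCH_URL + "/_scheduler/jobs",
        "scheduler_docs": COUCH_URL + "/_scheduler/docs",
    }


    def scheduler_doc_url(doc_id, replicator_db="_replicator"):
        """URL of the state summary for one replication document."""
        return "{}/_scheduler/docs/{}/{}".format(
            COUCH_URL, replicator_db, urllib.parse.quote(doc_id, safe=""))


    def fetch_json(url):
        """GET a URL and decode the JSON response (needs a running server)."""
        with urllib.request.urlopen(url) as response:
            return json.load(response)


    status_url = scheduler_doc_url("db_a_to_db_b")  # hypothetical document ID

With a server available, ``fetch_json(status_url)`` would return the state
summary for that replication document.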
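
For transient replications, there is no way to query their state once the job
has finished.

A persistent replication can be stopped by deleting its document from the
``_replicator`` database. A transient replication can be canceled by sending
the original JSON object to the ``_replicate`` endpoint again with its
``cancel`` property set to ``true``.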
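
A sketch of the two stop operations in Python: resending the original object
with ``cancel`` set to ``true`` for a replication started through
``_replicate``, and deleting a ``_replicator`` document. The server address,
the document ID and the revision ``1-abc`` are hypothetical values; the
requests are prepared but not sent.

.. code-block:: python

    import urllib.parse
    import urllib.request

    COUCH_URL = "http://localhost:5984"  # assumed server address


    def cancel_body(original_body):
        """Copy the original _replicate object and mark it as a cancellation."""
        return dict(original_body, cancel=True)


    def delete_replication_doc(doc_id, rev):
        """Prepare (but do not send) the DELETE for a _replicator document."""
        url = "{}/_replicator/{}?{}".format(
            COUCH_URL,
            urllib.parse.quote(doc_id, safe=""),
            urllib.parse.urlencode({"rev": rev}),
        )
        return urllib.request.Request(url, method="DELETE")


    original = {
        "source": COUCH_URL + "/db_a",
        "target": COUCH_URL + "/db_b",
        "continuous": True,
    }
    cancel = cancel_body(original)  # POST this to /_replicate to cancel
    delete = delete_replication_doc("db_a_to_db_b", "1-abc")  # hypothetical rev

Apart from the added ``cancel`` property, the cancellation object is kept
identical to the one that started the replication.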

Replication Procedure
=====================

During replication, CouchDB compares the source and the destination
database to determine which documents differ between them. It does so by
following the :ref:`changes` on the source
and comparing the documents to the destination. Changes are submitted to the
destination in batches, where they can introduce conflicts. Documents that
already exist on the destination in the same revision are not transferred. As
the deletion of documents is represented by a new revision, a document deleted
on the source will also be deleted on the target.

A replication task will finish once it reaches the end of the changes feed. If
its ``continuous`` property is set to true, it will wait for new changes to
appear until the task is canceled. Replication tasks also create checkpoint
documents on the destination to ensure that a restarted task can continue from
where it stopped, for example after it has crashed.
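
The procedure can be illustrated with a toy in-memory model. This is not
CouchDB's implementation: the changes feed is assumed to be a list of
``(seq, doc_id, rev, deleted)`` tuples with integer sequence numbers, the
target is a plain dictionary, and the checkpoint is simply the last processed
sequence number.

.. code-block:: python

    def replicate(changes, target, checkpoint=0, batch_size=2):
        """Transfer revisions the target lacks; return (checkpoint, transferred).

        changes: (seq, doc_id, rev, deleted) tuples ordered by seq.
        target: dict mapping doc_id to {rev: deleted} for the revisions it holds.
        """
        pending = [change for change in changes if change[0] > checkpoint]
        transferred = []
        for start in range(0, len(pending), batch_size):
            batch = pending[start:start + batch_size]
            for seq, doc_id, rev, deleted in batch:
                revisions = target.setdefault(doc_id, {})
                if rev in revisions:
                    continue  # same revision already on the target: skip it
                revisions[rev] = deleted  # a deletion is just another revision
                transferred.append((doc_id, rev))
            checkpoint = batch[-1][0]  # record progress after each batch
        return checkpoint, transferred


    source_changes = [
        (1, "a", "1-x", False),
        (2, "b", "1-y", False),
        (3, "c", "2-z", True),  # "c" was deleted on the source
    ]
    target = {"b": {"1-y": False}, "c": {"1-q": False}}

    checkpoint, sent = replicate(source_changes, target)

    # A restarted task resumes after the checkpoint instead of starting over.
    source_changes.append((4, "d", "1-w", False))
    checkpoint, sent_later = replicate(source_changes, target, checkpoint)

In this model the first run transfers ``a`` and the deletion of ``c`` but skips
``b``, which the target already holds in the same revision; the second run only
looks at changes after the stored checkpoint.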

When a replication task is initiated on the sending node, it is called *push*
replication; when it is initiated by the receiving node, it is called *pull*
replication.

Master-Master Replication
=========================

One replication task will only transfer changes in one direction. To achieve
master-master replication, it is possible to set up two replication tasks in
opposite directions. When a change is replicated from database A to B by the
first task, the second task from B to A will discover that the new change on
B already exists in A and will wait for further changes.
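
A short Python sketch of this setup: ``replication_pair`` builds the two
replication objects, and ``one_way`` is a toy stand-in for a single replication
task that treats each database as a set of ``(doc_id, rev)`` pairs. The
database URLs and both helper names are assumptions for illustration.

.. code-block:: python

    def replication_pair(db_a, db_b):
        """Two one-way replication objects that together sync both databases."""
        return [
            {"source": db_a, "target": db_b, "continuous": True},
            {"source": db_b, "target": db_a, "continuous": True},
        ]


    def one_way(source, target):
        """Toy replication: copy (doc_id, rev) pairs the target lacks."""
        missing = source - target
        target |= missing  # updates the caller's set in place
        return missing


    pair = replication_pair("http://localhost:5984/db_a",
                            "http://localhost:5984/db_b")

    db_a = {("doc1", "1-x")}
    db_b = {("doc1", "1-x")}
    db_a.add(("doc1", "2-z"))     # a change is made on A

    a_to_b = one_way(db_a, db_b)  # first task transfers the new revision
    b_to_a = one_way(db_b, db_a)  # second task finds nothing new to send

After both calls the two sets are equal, and the second task has transferred
nothing because the change already exists on A.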

Controlling which Documents to Replicate
========================================

There are three options for controlling which documents are replicated,
and which are skipped:

1. Defining documents as being local.
2. Using :ref:`selectorobj`.
3. Using :ref:`filterfun`.

Local documents are never replicated (see :ref:`api/local`).

:ref:`selectorobj` can be included in a replication document (see
:ref:`replication-settings`). A selector object contains a query expression
that is used to test whether a document should be replicated.

:ref:`filterfun` can be used in a replication (see
:ref:`replication-settings`). The replication task evaluates
the filter function for each document in the changes feed. The document is
only replicated if the filter returns ``true``.
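
To illustrate the idea, the Python sketch below adds a ``selector`` to a
replication object and applies simplified stand-ins for both mechanisms to a
few documents. ``matches`` handles only plain equality, which is a small subset
of what selectors can express, and ``is_order`` merely models a filter function
in Python; the field ``type`` and all names are assumptions.

.. code-block:: python

    def matches(selector, doc):
        """Equality-only stand-in for selector evaluation."""
        return all(field in doc and doc[field] == value
                   for field, value in selector.items())


    def is_order(doc):
        """Python stand-in for a filter function; True means "replicate"."""
        return doc.get("type") == "order"


    selector = {"type": "order"}
    replication_doc = {
        "source": "http://localhost:5984/db_a",
        "target": "http://localhost:5984/db_b",
        "selector": selector,
    }

    docs = [
        {"_id": "o1", "type": "order"},
        {"_id": "u1", "type": "user"},
        {"_id": "x1"},
    ]
    by_selector = [doc["_id"] for doc in docs if matches(selector, doc)]
    by_filter = [doc["_id"] for doc in docs if is_order(doc)]

Both approaches select only ``o1`` here; the other two documents would be
skipped by the replication.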

.. note::
    Using a selector provides performance benefits when compared with using a
    :ref:`filterfun`. You should use :ref:`selectorobj` where possible.

.. note::
    When using replication filters that depend on the document's content,
    deleted documents may pose a problem, since the document passed to the
    filter will not contain any of the document's content. This can be
    resolved by adding a ``_deleted:true`` field to the document instead
    of using the DELETE HTTP method, paired with the use of a
    :ref:`validate document update <vdufun>` handler to ensure the fields
    required for replication filters are always present. Take note, though,
    that the deleted document will still contain all of its data (including
    attachments)!
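
The effect described in the second note can be sketched in Python. ``by_type``
models a content-based filter, and ``tombstone`` builds a deletion revision
that carries ``_deleted`` plus the fields the filter relies on. Keeping only
selected fields rather than the whole document is a choice made for this
sketch, and the document values are hypothetical.

.. code-block:: python

    def tombstone(doc, keep_fields):
        """Build a deletion revision that keeps the fields a filter relies on."""
        stub = {"_id": doc["_id"], "_rev": doc["_rev"], "_deleted": True}
        for field in keep_fields:
            stub[field] = doc[field]
        return stub


    def by_type(doc):
        """Python stand-in for a content-based filter function."""
        return doc.get("type") == "order"


    order = {"_id": "order-1", "_rev": "1-abc", "type": "order", "total": 42}

    # What a filter sees after a plain DELETE: no document content at all.
    plain_delete = {"_id": "order-1", "_rev": "2-def", "_deleted": True}

    # What it sees when the document is updated with _deleted set instead.
    kept_fields = tombstone(order, ["type"])

    passes_plain = by_type(plain_delete)
    passes_kept = by_type(kept_fields)

Only the second variant passes the filter, so only that deletion would reach
the target in this model.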

Migrating Data to Clients
=========================

Replication can be especially useful for bringing data closer to clients.
`PouchDB <http://pouchdb.com/>`_ implements the replication algorithm of CouchDB
in JavaScript, making it possible to make data from a CouchDB database
available in an offline browser application, and synchronize changes back to
CouchDB.
