<chapter id="overview" xreflabel="Overview">
 <title>repmgr overview</title>

 <para>
  This chapter provides a high-level overview of &repmgr;'s components and
  functionality.
 </para>
 <sect1 id="repmgr-concepts" xreflabel="Concepts">

  <title>Concepts</title>

  <indexterm>
    <primary>concepts</primary>
  </indexterm>

  <para>
   This guide assumes that you are familiar with PostgreSQL administration and
   streaming replication concepts. For further details on streaming
   replication, see the PostgreSQL documentation section on
   <ulink url="https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION">streaming replication</ulink>.
  </para>
  <para>
   The following terms are used throughout the &repmgr; documentation.
   <variablelist>
    <varlistentry>
     <term>replication cluster</term>
     <listitem>
      <simpara>
       In the &repmgr; documentation, "replication cluster" refers to the network
       of PostgreSQL servers connected by streaming replication.
      </simpara>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>node</term>
     <listitem>
      <simpara>
       A node is a single PostgreSQL server within a replication cluster.
      </simpara>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>upstream node</term>
     <listitem>
      <simpara>
       The node a standby server connects to, in order to receive streaming replication.
       This is either the primary server, or in the case of cascading replication, another
       standby.
      </simpara>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>failover</term>
     <listitem>
      <simpara>
       This is the action which occurs when a primary server fails and a suitable standby
       is promoted to become the new primary. The &repmgrd; daemon supports automatic
       failover to minimise downtime.
      </simpara>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>switchover</term>
     <listitem>
      <simpara>
       In certain circumstances, such as hardware or operating system maintenance,
       it's necessary to take a primary server offline; in this case a controlled
       switchover is necessary, whereby a suitable standby is promoted and the
       existing primary removed from the replication cluster in a controlled manner.
       The &repmgr; command line client provides this functionality.
      </simpara>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>fencing</term>
     <listitem>
      <simpara>
       In a failover situation, following the promotion of a standby to primary, it's
       essential that the previous primary does not unexpectedly come back
       online, which would result in a split-brain situation. To prevent this,
       the failed primary should be isolated from applications, i.e. "fenced off".
      </simpara>
     </listitem>
    </varlistentry>
    <varlistentry id="witness-server">
     <term>witness server</term>
     <listitem>
      <para>
        &repmgr; provides functionality to set up a so-called "witness server" to
        assist in determining a new primary server in a failover situation with more
        than one standby. The witness server itself is not part of the replication
        cluster, although it does contain a copy of the &repmgr; metadata schema.
      </para>
      <para>
        The purpose of a witness server is to provide a "casting vote" where servers
        in the replication cluster are split over more than one location. In the event
        of a loss of connectivity between locations, the presence or absence of
        the witness server will decide whether a server at that location is promoted
        to primary; this is to prevent a "split-brain" situation where an isolated
        location interprets a network outage as a failure of the (remote) primary and
        promotes a (local) standby.
      </para>
      <para>
        A witness server only needs to be created if &repmgrd;
        is in use.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
 </sect1>
 <sect1 id="repmgr-components" xreflabel="Components">
  <title>Components</title>
  <para>
   &repmgr; is a suite of open-source tools to manage replication and failover
   within a cluster of PostgreSQL servers. It supports and enhances PostgreSQL's
   built-in streaming replication, which provides a single read/write primary server
   and one or more read-only standbys containing near-real-time copies of the primary
   server's database. &repmgr; provides two main tools:
   <variablelist>
    <varlistentry>
     <term>repmgr</term>
     <listitem>
      <para>
       A command-line tool used to perform administrative tasks such as:
       <itemizedlist>
        <listitem>
          <simpara>setting up standby servers</simpara>
        </listitem>
        <listitem>
          <simpara>promoting a standby server to primary</simpara>
        </listitem>
        <listitem>
          <simpara>switching over primary and standby servers</simpara>
        </listitem>
        <listitem>
          <simpara>displaying the status of servers in the replication cluster</simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>repmgrd</term>
     <listitem>
      <para>
       A daemon which actively monitors servers in a replication cluster
       and performs the following tasks:
       <itemizedlist>
        <listitem>
          <simpara>monitoring and recording replication performance</simpara>
        </listitem>
        <listitem>
          <simpara>performing failover by detecting failure of the primary and
            promoting the most suitable standby server
          </simpara>
        </listitem>
        <listitem>
          <simpara>providing notifications about events in the cluster to a user-defined
            script which can perform tasks such as sending alerts by email</simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
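  <para>
   As a rough illustration of the command-line tasks listed above, the following
   invocations sketch how each might look. The configuration file path
   <filename>/etc/repmgr.conf</filename>, the host name <literal>node1</literal> and the
   <literal>repmgr</literal> user and database names are assumptions made for this
   example; adjust them to match the local installation.
  </para>
  <programlisting>
    # Clone a standby from an existing node (run on the new standby; "node1" is hypothetical)
    repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone

    # Register the cloned server as a standby once PostgreSQL has been started
    repmgr -f /etc/repmgr.conf standby register

    # Promote a standby to primary after the original primary has failed
    repmgr -f /etc/repmgr.conf standby promote

    # Perform a controlled switchover (run on the standby which is to become primary)
    repmgr -f /etc/repmgr.conf standby switchover

    # Display the status of servers in the replication cluster
    repmgr -f /etc/repmgr.conf cluster show
  </programlisting>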
 </sect1>

 <sect1 id="repmgr-user-metadata" xreflabel="Repmgr user and metadata">
  <title>Repmgr user and metadata</title>
  <para>
   In order to effectively manage a replication cluster, &repmgr; needs to store
   information about the servers in the cluster in a dedicated database schema.
   This schema is automatically created by the &repmgr; extension, which is installed
   during the first step in initializing a &repmgr;-administered cluster
   (<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>)
   and contains the following objects:
   <variablelist>
    <varlistentry>
     <term>Tables</term>
     <listitem>
      <para>
       <itemizedlist>
        <listitem>
          <simpara><literal>repmgr.events</literal>: records events of interest</simpara>
        </listitem>
        <listitem>
          <simpara><literal>repmgr.nodes</literal>: connection and status information for each server in the
            replication cluster</simpara>
        </listitem>
        <listitem>
          <simpara><literal>repmgr.monitoring_history</literal>: historical standby monitoring information
            written by &repmgrd;</simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>Views</term>
     <listitem>
      <para>
       <itemizedlist>
        <listitem>
          <simpara><literal>repmgr.show_nodes</literal>: based on the table <literal>repmgr.nodes</literal>, additionally showing the
            name of the server's upstream node</simpara>
        </listitem>
        <listitem>
          <simpara><literal>repmgr.replication_status</literal>: when &repmgrd;'s monitoring is enabled, shows
            the current monitoring status for each standby</simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
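  <para>
   The metadata objects can be inspected with any PostgreSQL client. The following is a
   minimal sketch using <application>psql</application>, assuming the &repmgr; metadata
   resides in a database named <literal>repmgr</literal> which is accessible to a user
   named <literal>repmgr</literal> (both names are assumptions made for this example):
  </para>
  <programlisting>
    # List each node together with the name of its upstream node
    psql -U repmgr -d repmgr -c 'SELECT * FROM repmgr.show_nodes'

    # Show the events recorded so far
    psql -U repmgr -d repmgr -c 'SELECT * FROM repmgr.events'
  </programlisting>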

  <para>
   The &repmgr; metadata schema can be stored in an existing database or in its own
   dedicated database. Note that the &repmgr; metadata schema cannot reside on a database
   server which is not part of the replication cluster managed by &repmgr;.
  </para>
  <para>
   A database user must be available for &repmgr; to access this database and perform
   necessary changes. This user does not need to be a superuser; however, some operations,
   such as initial installation of the &repmgr; extension, will require a superuser
   connection (this can be specified where required with the command line option
   <literal>--superuser</literal>).
  </para>
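  <para>
   For example, if the &repmgr; user is not a superuser, the primary server might be
   registered as follows. This is a sketch only: the configuration file path and the
   superuser name <literal>postgres</literal> are assumptions.
  </para>
  <programlisting>
    # Register the primary, using the "postgres" superuser where elevated privileges are required
    repmgr -f /etc/repmgr.conf primary register --superuser=postgres
  </programlisting>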
 </sect1>

</chapter>