<chapter id="cloning-standbys" xreflabel="cloning standbys">
 <title>Cloning standbys</title>

 <sect1 id="cloning-from-barman" xreflabel="Cloning from Barman">
   <title>Cloning a standby from Barman</title>

   <indexterm>
    <primary>cloning</primary>
    <secondary>from Barman</secondary>
   </indexterm>
   <indexterm>
    <primary>Barman</primary>
    <secondary>cloning a standby</secondary>
   </indexterm>

   <para>
    <xref linkend="repmgr-standby-clone"/> can use
    <ulink url="https://www.2ndquadrant.com/">2ndQuadrant</ulink>'s
    <ulink url="https://www.pgbarman.org/">Barman</ulink> application
    to clone a standby (and also as a fallback source for WAL files).
   </para>
   <tip>
    <simpara>
     Barman (aka PgBarman) should be considered an integral part of any
     PostgreSQL replication cluster. For more details see:
     <ulink url="https://www.pgbarman.org/">https://www.pgbarman.org/</ulink>.
    </simpara>
   </tip>
   <para>
    Barman support provides the following advantages:
    <itemizedlist spacing="compact" mark="bullet">
     <listitem>
      <para>
       the primary node does not need to perform a new backup every time a
       new standby is cloned
      </para>
     </listitem>
     <listitem>
      <para>
       a standby node can be disconnected for longer periods without losing
       the ability to catch up, and without causing accumulation of WAL
       files on the primary node
      </para>
     </listitem>
     <listitem>
      <para>
       WAL management on the primary becomes much easier as there's no need
       to use replication slots, and <varname>wal_keep_segments</varname>
       (PostgreSQL 13 and later: <varname>wal_keep_size</varname>)
       does not need to be set.
      </para>
     </listitem>
    </itemizedlist>
   </para>


   <note>
    <para>
      Currently &repmgr;'s support for cloning from Barman is implemented by using
      <productname>rsync</productname> to clone from the Barman server.
    </para>
    <para>
      It is therefore not able to make use of Barman's parallel restore facility, which
      is executed on the Barman server and clones to the target server.
    </para>
    <para>
      Barman's parallel restore facility can be used by executing it manually on
      the Barman server and configuring replication on the resulting cloned
      standby using
      <command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>.
    </para>
   </note>
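   <para>
    As a rough illustration of this manual approach (hostnames, the data directory
    and the degree of parallelism are examples only): on the Barman server, restore
    the latest backup to the target node, then on the target node generate the
    replication configuration:
    <programlisting>
    $ barman recover --jobs 4 --remote-ssh-command 'ssh postgres@node2' pg latest /var/lib/postgresql/data

    $ repmgr -f /etc/repmgr.conf -h node1 -U repmgr -d repmgr standby clone --replication-conf-only</programlisting>
   </para>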


  <sect2 id="cloning-from-barman-prerequisites">
   <title>Prerequisites for cloning from Barman</title>
   <para>
    In order to enable Barman support for <command>repmgr standby clone</command>, the following
    prerequisites must be met:
   <itemizedlist spacing="compact" mark="bullet">
     <listitem>
      <para>
        the Barman catalogue must include at least one valid backup for this server
        (one way to verify this is shown after this list);
      </para>
     </listitem>
     <listitem>
      <para>
        the <varname>barman_host</varname> setting in <filename>repmgr.conf</filename> is set to the SSH
        hostname of the Barman server;
      </para>
     </listitem>
     <listitem>
      <para>
        the <varname>barman_server</varname> setting in <filename>repmgr.conf</filename> is the same as the
        server configured in Barman.
      </para>
     </listitem>

   </itemizedlist>
   </para>
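   <para>
    One way to verify the first of these prerequisites (assuming the Barman server
    name is <literal>pg</literal>, as in the following examples) is to run the
    following on the Barman server; cloning will not be possible if the catalogue
    contains no valid backup:
    <programlisting>
    $ barman check pg
    $ barman list-backup pg</programlisting>
   </para>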

   <para>
     For example, assuming Barman is located on the host &quot;<literal>barmansrv</literal>&quot;
     under the &quot;<literal>barman</literal>&quot; user account,
     <filename>repmgr.conf</filename> should contain the following entries:
    <programlisting>
    barman_host='barman@barmansrv'
    barman_server='pg'</programlisting>
   </para>
   <para>
     Here <literal>pg</literal> corresponds to a section in Barman's configuration file for a specific
     server backup configuration, which would look something like:
     <programlisting>
[pg]
description = "Main cluster"
...
     </programlisting>
   </para>
   <para>
     More details on Barman configuration can be found in the
     <ulink url="https://docs.pgbarman.org/">Barman documentation</ulink>'s
     <ulink url="https://docs.pgbarman.org/#configuration">configuration section</ulink>.
   </para>
   <note>
    <para>
     To use a non-default Barman configuration file on the Barman server,
     specify this in <filename>repmgr.conf</filename> with <varname>barman_config</varname>:
     <programlisting>
      barman_config='/path/to/barman.conf'</programlisting>
    </para>
   </note>


   <para>
     We also recommend configuring the <varname>restore_command</varname> setting in <filename>repmgr.conf</filename>
     to use the <command>barman-wal-restore</command> script
     (see section <xref linkend="cloning-from-barman-restore-command"/> below).
   </para>


   <tip>
    <simpara>
      If you have a non-default SSH configuration on the Barman
      server, e.g. using a port other than 22, then you can set those
      parameters in a dedicated Host section in <filename>~/.ssh/config</filename>
      corresponding to the value of <varname>barman_host</varname> in
      <filename>repmgr.conf</filename>. See the <literal>Host</literal>
      section in <command>man 5 ssh_config</command> for more details.
    </simpara>
   </tip>
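   <para>
    For example, if the Barman server's SSH daemon listens on a non-standard port
    (the port number below is purely illustrative), <filename>~/.ssh/config</filename>
    on the node being cloned might contain:
    <programlisting>
    Host barmansrv
        Port 2222</programlisting>
   </para>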
   <para>
     If you wish to place WAL files in a location outside the main
     PostgreSQL data directory, set <option>--waldir</option>
     (PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
     <option>pg_basebackup_options</option> to the target directory
     (this must be an absolute path). &repmgr; will create this directory
     and symlink to it from the data directory, in exactly the same way
     <application>pg_basebackup</application> would.
   </para>
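   <para>
    For example (the WAL directory path shown is illustrative):
    <programlisting>
    pg_basebackup_options='--waldir=/var/lib/postgresql/wal'</programlisting>
   </para>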
   <para>
    It's now possible to clone a standby from Barman, e.g.:
    <programlisting>
    $ repmgr -f /etc/repmgr.conf -h node1 -U repmgr -d repmgr standby clone
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to Barman server to verify backup for "test_cluster"
    INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/data"
    INFO: creating directory "/var/lib/postgresql/data/repmgr"...
    INFO: connecting to Barman server to fetch server parameters
    INFO: connecting to source node
    DETAIL: current installation size is 30 MB
    NOTICE: retrieving backup from Barman...
    (...)
    NOTICE: standby clone (from Barman) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
   </para>

   <note>
    <simpara>
     Barman support is automatically enabled if <varname>barman_server</varname>
     is set. Normally it is good practice to use Barman, for instance
     when fetching a base backup while cloning a standby; if necessary,
     Barman mode can be disabled using the <literal>--without-barman</literal>
     command line option.
    </simpara>
   </note>

  </sect2>
  <sect2 id="cloning-from-barman-restore-command" xreflabel="Using Barman as a WAL file source">
   <title>Using Barman as a WAL file source</title>

   <indexterm>
    <primary>Barman</primary>
    <secondary>fetching archived WAL</secondary>
   </indexterm>

   <para>
    As a fallback in case streaming replication is interrupted, PostgreSQL can optionally
    retrieve WAL files from an archive, such as that provided by Barman. This is done by
    setting <varname>restore_command</varname> in <filename>recovery.conf</filename> to
    a valid shell command which can retrieve a specified WAL file from the archive.
   </para>
   <para>
     <command>barman-wal-restore</command> is a Python script provided as part of the <literal>barman-cli</literal>
     package (Barman 2.0 to 2.7) or as part of the core Barman distribution (Barman 2.8 and later).
   </para>
   <para>
    To use <command>barman-wal-restore</command> with &repmgr;,
    assuming Barman is located on the host &quot;<literal>barmansrv</literal>&quot;
    under the &quot;<literal>barman</literal>&quot; user account,
    and that <command>barman-wal-restore</command> is located as an executable at
    <filename>/usr/bin/barman-wal-restore</filename>,
    <filename>repmgr.conf</filename> should include the following lines:
    <programlisting>
    barman_host='barman@barmansrv'
    barman_server='pg'
    restore_command='/usr/bin/barman-wal-restore barmansrv pg %f %p'</programlisting>
   </para>
   <note>
    <simpara>
      <command>barman-wal-restore</command> supports command line switches to
      control parallelism (<literal>--parallel=N</literal>) and compression
      (<literal>--bzip2</literal>, <literal>--gzip</literal>).
    </simpara>
   </note>
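   <para>
    For example, to fetch WAL files over multiple connections (the degree of
    parallelism shown is arbitrary), <varname>restore_command</varname> could
    be defined as:
    <programlisting>
    restore_command='/usr/bin/barman-wal-restore --parallel=4 barmansrv pg %f %p'</programlisting>
   </para>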

  </sect2>
 </sect1>

 <sect1 id="cloning-replication-slots" xreflabel="Cloning and replication slots">
   <title>Cloning and replication slots</title>

   <indexterm>
     <primary>cloning</primary>
     <secondary>replication slots</secondary>
   </indexterm>

   <indexterm>
     <primary>replication slots</primary>
     <secondary>cloning</secondary>
   </indexterm>
   <para>
    Replication slots were introduced with PostgreSQL 9.4 and are designed to ensure
    that any standby connected to the primary using a replication slot will always
    be able to retrieve the required WAL files. This removes the need to manually
    manage WAL file retention by estimating the number of WAL files that need to
    be maintained on the primary using <varname>wal_keep_segments</varname>
    (PostgreSQL 13 and later: <varname>wal_keep_size</varname>).
    Do however be aware that if a standby is disconnected, WAL will continue to
    accumulate on the primary until either the standby reconnects or the replication
    slot is dropped.
   </para>
   <para>
     To enable &repmgr; to use replication slots, set the boolean parameter
     <varname>use_replication_slots</varname> in <filename>repmgr.conf</filename>:
     <programlisting>
       use_replication_slots=true</programlisting>
   </para>
   <para>
    Replication slots must be enabled in <filename>postgresql.conf</filename> by
    setting the parameter <varname>max_replication_slots</varname> to at least the
    number of expected standbys (changes to this parameter require a server restart).
   </para>
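   <para>
    For example, to allow for up to four standbys, <filename>postgresql.conf</filename>
    on any node which could act as a replication source would contain:
    <programlisting>
    max_replication_slots = 4</programlisting>
   </para>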
   <para>
    When cloning a standby, &repmgr; will automatically generate an appropriate
    slot name, which is stored in the <literal>repmgr.nodes</literal> table, and create the slot
    on the upstream node:
     <programlisting>
    repmgr=# SELECT node_id, upstream_node_id, active, node_name, type, priority, slot_name
               FROM repmgr.nodes ORDER BY node_id;
     node_id | upstream_node_id | active | node_name |  type   | priority |   slot_name
    ---------+------------------+--------+-----------+---------+----------+---------------
           1 |                  | t      | node1     | primary |      100 | repmgr_slot_1
           2 |                1 | t      | node2     | standby |      100 | repmgr_slot_2
           3 |                1 | t      | node3     | standby |      100 | repmgr_slot_3
     (3 rows)</programlisting>

    <programlisting>
    repmgr=# SELECT slot_name, slot_type, active, active_pid FROM pg_replication_slots;
       slot_name   | slot_type | active | active_pid
    ---------------+-----------+--------+------------
     repmgr_slot_2 | physical  | t      |      23658
     repmgr_slot_3 | physical  | t      |      23687
    (2 rows)</programlisting>
   </para>
   <para>
    Note that a slot name will be created by default for the primary but not
    actually used unless the primary is converted to a standby using e.g.
    <command>repmgr standby switchover</command>.
   </para>
   <para>
    Further information on replication slots can be found in the PostgreSQL documentation:
    <ulink url="https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS">https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS</ulink>
   </para>
   <tip>
    <simpara>
     While replication slots can be useful for streaming replication, it's
     recommended to monitor for inactive slots as these will cause WAL files to
     build up indefinitely, possibly leading to server failure.
    </simpara>
    <simpara>
     As an alternative we recommend using 2ndQuadrant's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
     which offloads WAL management to a separate server, removing the requirement to use a replication
     slot for each individual standby to reserve WAL. See section <xref linkend="cloning-from-barman"/>
     for more details on using &repmgr; together with Barman.
    </simpara>
   </tip>
 </sect1>

 <sect1 id="cloning-cascading" xreflabel="Cloning and cascading replication">
   <title>Cloning and cascading replication</title>

   <indexterm>
     <primary>cloning</primary>
     <secondary>cascading replication</secondary>
   </indexterm>

   <para>
    Cascading replication, introduced with PostgreSQL 9.2, enables a standby server
    to replicate from another standby server rather than directly from the primary,
    meaning replication changes "cascade" down through a hierarchy of servers. This
    can be used to reduce load on the primary and minimize bandwidth usage between
    sites. For more details, see the
    <ulink url="https://www.postgresql.org/docs/current/warm-standby.html#CASCADING-REPLICATION">
    PostgreSQL cascading replication documentation</ulink>.
   </para>
   <para>
    &repmgr; supports cascading replication. When cloning a standby,
    set the command-line parameter <literal>--upstream-node-id</literal> to the
    <varname>node_id</varname> of the server the standby should connect to, and
    &repmgr; will create <filename>recovery.conf</filename> to point to it. Note
    that if <literal>--upstream-node-id</literal> is not explicitly provided,
    &repmgr; will set the standby's <filename>recovery.conf</filename> to
    point to the primary node.
   </para>
   <para>
    To demonstrate cascading replication, first ensure you have a primary and standby
    set up as shown in the <xref linkend="quickstart"/>.
    Then create an additional standby server with <filename>repmgr.conf</filename> looking
    like this:
    <programlisting>
    node_id=3
    node_name=node3
    conninfo='host=node3 user=repmgr dbname=repmgr'
    data_directory='/var/lib/postgresql/data'</programlisting>
   </para>
   <para>
    Clone this standby (using the connection parameters for the existing standby),
    ensuring <literal>--upstream-node-id</literal> is provided with the <varname>node_id</varname>
    of the previously created standby (if following the example, this will be <literal>2</literal>):
    <programlisting>
    $ repmgr -h node2 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --upstream-node-id=2
    NOTICE: using configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to upstream node
    INFO: connected to source node, checking its state
    NOTICE: checking for available walsenders on upstream node (2 required)
    INFO: sufficient walsenders available on upstream node (2 required)
    INFO: successfully connected to source node
    DETAIL: current installation size is 29 MB
    INFO: creating directory "/var/lib/postgresql/data"...
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing: 'pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node2 -U repmgr -X stream '
    NOTICE: standby clone (using pg_basebackup) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>

    then register it (note that <literal>--upstream-node-id</literal> must be provided here
    too):
    <programlisting>
     $ repmgr -f /etc/repmgr.conf standby register --upstream-node-id=2
     NOTICE: standby node "node3" (ID: 3) successfully registered
    </programlisting>
   </para>
   <para>
    After starting the standby, the cluster will look like this, showing that <literal>node3</literal>
    is attached to <literal>node2</literal>, not the primary (<literal>node1</literal>).
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr
    </programlisting>
   </para>
   <tip>
    <simpara>
     Under some circumstances when setting up a cascading replication
     cluster, you may wish to clone a downstream standby whose upstream node
     does not yet exist. In this case you can clone from the primary (or
     another upstream node); provide the parameter <literal>--upstream-conninfo</literal>
     to explicitly set the upstream's <varname>primary_conninfo</varname> string
     in <filename>recovery.conf</filename>.
    </simpara>
   </tip>
 </sect1>

 <sect1 id="cloning-advanced" xreflabel="Advanced cloning options">
   <title>Advanced cloning options</title>
   <indexterm>
     <primary>cloning</primary>
     <secondary>advanced options</secondary>
   </indexterm>

   <sect2 id="cloning-advanced-pg-basebackup-options" xreflabel="pg_basebackup options when cloning a standby">
    <title>pg_basebackup options when cloning a standby</title>
    <para>
      As &repmgr; uses <command>pg_basebackup</command> to clone a standby, it's possible to
      provide additional parameters for <command>pg_basebackup</command> to customise the
      cloning process.
    </para>

    <para>
     By default, <command>pg_basebackup</command> performs a checkpoint before beginning the backup
     process. However, a normal checkpoint may take some time to complete;
     a fast checkpoint can be forced with <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>'s
     <literal>-c/--fast-checkpoint</literal> option.
     Note that this may impact performance of the server being cloned from (typically the primary),
     so should be used with care.
    </para>
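    <para>
     For example (connection parameters as in the preceding cloning examples):
     <programlisting>
    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --fast-checkpoint</programlisting>
    </para>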
    <tip>
      <simpara>
        If <application>Barman</application> is set up for the cluster, it's possible to
        clone the standby directly from Barman, without any impact on the server the standby
        is being cloned from. For more details see <xref linkend="cloning-from-barman"/>.
      </simpara>
    </tip>
    <para>
      Other options can be passed to <command>pg_basebackup</command> by including them
      in the <filename>repmgr.conf</filename> setting <varname>pg_basebackup_options</varname>.
    </para>
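    <para>
      For instance, to limit the network bandwidth used by the base backup
      (the rate shown is arbitrary), <filename>repmgr.conf</filename> might contain:
      <programlisting>
    pg_basebackup_options='--max-rate=32M'</programlisting>
    </para>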

    <para>
      Note that by default, &repmgr; executes <command>pg_basebackup</command> with <option>-X/--wal-method</option>
      (PostgreSQL 9.6 and earlier: <option>-X/--xlog-method</option>) set to <literal>stream</literal>.
      From PostgreSQL 9.6, if replication slots are in use, it will also create a replication slot before
      running the base backup, and execute <command>pg_basebackup</command> with the
      <option>-S/--slot</option> option set to the name of the previously created replication slot.
    </para>
    <para>
      These parameters can be set by the user in <varname>pg_basebackup_options</varname>, in which case they
      will override the &repmgr; default values. However, normally there is no reason to do this.
    </para>
    <para>
      If using a separate directory to store WAL files, provide the option <literal>--waldir</literal>
      (<literal>--xlogdir</literal> in PostgreSQL 9.6 and earlier) with the absolute path to the
      WAL directory. Any WALs generated during the cloning process will be copied here, and
      a symlink will automatically be created from the main data directory.
    </para>
    <tip>
      <para>
        The <literal>--waldir</literal> (<literal>--xlogdir</literal>) option,
        if present in <varname>pg_basebackup_options</varname>, will be honoured by &repmgr;
        when cloning from Barman (&repmgr; 5.2 and later).
      </para>
    </tip>
    <para>
     See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
     for more details of available options.
    </para>
   </sect2>

   <sect2 id="cloning-advanced-managing-passwords" xreflabel="Managing passwords">
    <title>Managing passwords</title>
    <indexterm>
      <primary>cloning</primary>
      <secondary>using passwords</secondary>
    </indexterm>

    <para>
     If replication connections to a standby's upstream server are password-protected,
     the standby must be able to provide the password so it can begin streaming replication.
    </para>

    <para>
     The recommended way to do this is to store the password in the <literal>postgres</literal> system
     user's <filename>~/.pgpass</filename> file. For more information on using the password file, see
     the documentation section <xref linkend="configuration-password-file"/>.
    </para>

    <note>
      <para>
        If using a <filename>pgpass</filename> file, an entry for the replication user (by default the
        user who connects to the <literal>repmgr</literal> database) <emphasis>must</emphasis>
        be provided, with database name set to <literal>replication</literal>, e.g.:
        <programlisting>
          node1:5432:replication:repmgr:12345</programlisting>
      </para>
    </note>

    <para>
     If, for whatever reason, you wish to include the password in <filename>recovery.conf</filename>,
     set <varname>use_primary_conninfo_password</varname> to <literal>true</literal> in
     <filename>repmgr.conf</filename>. This will read a password set in <varname>PGPASSWORD</varname>
     (but not <filename>~/.pgpass</filename>) and place it into the <varname>primary_conninfo</varname>
     string in <filename>recovery.conf</filename>. Note that <varname>PGPASSWORD</varname>
     will need to be set during any action which causes <filename>recovery.conf</filename> to be
     rewritten, e.g. <xref linkend="repmgr-standby-follow"/>.
    </para>
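    <para>
     A minimal sketch of this approach (the password value is a placeholder): set
     the option in <filename>repmgr.conf</filename>, and ensure <varname>PGPASSWORD</varname>
     is present in the environment when cloning:
     <programlisting>
    use_primary_conninfo_password=true</programlisting>
     <programlisting>
    $ export PGPASSWORD='replication-password'
    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone</programlisting>
    </para>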
   </sect2>

   <sect2 id="cloning-advanced-replication-user" xreflabel="Separate replication user">
    <title>Separate replication user</title>
    <para>
     In some circumstances it might be desirable to create a dedicated replication-only
     user (in addition to the user who manages the &repmgr; metadata). In this case,
     the replication user should be set in <filename>repmgr.conf</filename> via the parameter
     <varname>replication_user</varname>; &repmgr; will use this value when making
     replication connections and generating <filename>recovery.conf</filename>. This
     value will also be stored in the <literal>repmgr.nodes</literal>
     table for each node; it no longer needs to be explicitly specified when
     cloning a node or executing <xref linkend="repmgr-standby-follow"/>.
    </para>
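    <para>
     For example, assuming a dedicated replication role named
     <literal>repl_user</literal> has been created (the role name is illustrative),
     <filename>repmgr.conf</filename> would contain:
     <programlisting>
    conninfo='host=node2 user=repmgr dbname=repmgr'
    replication_user='repl_user'</programlisting>
    </para>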
   </sect2>


   <sect2 id="cloning-advanced-tablespace-mapping" xreflabel="Tablespace mapping">
    <title>Tablespace mapping</title>
    <indexterm>
      <primary>tablespace mapping</primary>
    </indexterm>
    <para>
      &repmgr; provides a <option>tablespace_mapping</option> configuration
      file option, which makes it possible to map a tablespace on the source node to
      a different location on the local node.
    </para>
    <para>
      To use this, add <option>tablespace_mapping</option> to <filename>repmgr.conf</filename>
      like this:
<programlisting>
  tablespace_mapping='/var/lib/pgsql/tblspc1=/data/pgsql/tblspc1'
</programlisting>
    </para>
    <para>
      where the left-hand value represents the tablespace location on the source node,
      and the right-hand value represents the corresponding location on the standby to be cloned.
    </para>
    <para>
      This parameter can be provided multiple times.
    </para>
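    <para>
      For example, to remap two tablespaces (the paths are illustrative):
<programlisting>
  tablespace_mapping='/var/lib/pgsql/tblspc1=/data/pgsql/tblspc1'
  tablespace_mapping='/var/lib/pgsql/tblspc2=/data/pgsql/tblspc2'
</programlisting>
    </para>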
   </sect2>

 </sect1>


</chapter>