<chapter id="cloning-standbys" xreflabel="cloning standbys">
 <title>Cloning standbys</title>

 <sect1 id="cloning-from-barman" xreflabel="Cloning from Barman">
  <title>Cloning a standby from Barman</title>

  <indexterm>
    <primary>cloning</primary>
    <secondary>from Barman</secondary>
  </indexterm>

  <indexterm>
    <primary>Barman</primary>
    <secondary>cloning a standby</secondary>
  </indexterm>

  <para>
    <xref linkend="repmgr-standby-clone"/> can use
    <ulink url="https://www.2ndquadrant.com/">2ndQuadrant</ulink>'s
    <ulink url="https://www.pgbarman.org/">Barman</ulink> application
    to clone a standby (and also as a fallback source for WAL files).
  </para>

  <tip>
    <simpara>
      Barman (aka PgBarman) should be considered as an integral part of any
      PostgreSQL replication cluster. For more details see:
      <ulink url="https://www.pgbarman.org/">https://www.pgbarman.org/</ulink>.
    </simpara>
  </tip>

  <para>
    Barman support provides the following advantages:
    <itemizedlist spacing="compact" mark="bullet">
      <listitem>
        <para>
          the primary node does not need to perform a new backup every time a
          new standby is cloned
        </para>
      </listitem>
      <listitem>
        <para>
          a standby node can be disconnected for longer periods without losing
          the ability to catch up, and without causing accumulation of WAL
          files on the primary node
        </para>
      </listitem>
      <listitem>
        <para>
          WAL management on the primary becomes much easier, as there's no need
          to use replication slots, and <varname>wal_keep_segments</varname>
          (PostgreSQL 13 and later: <varname>wal_keep_size</varname>)
          does not need to be set.
        </para>
      </listitem>
    </itemizedlist>
  </para>

  <note>
    <para>
      Currently &repmgr;'s support for cloning from Barman is implemented by using
      <productname>rsync</productname> to clone from the Barman server.
    </para>
    <para>
      It is therefore not able to make use of Barman's parallel restore facility, which
      is executed on the Barman server and clones to the target server.
    </para>
    <para>
      Barman's parallel restore facility can be used by executing it manually on
      the Barman server and configuring replication on the resulting cloned
      standby using
      <command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>.
    </para>
  </note>

  <sect2 id="cloning-from-barman-prerequisites">
    <title>Prerequisites for cloning from Barman</title>
    <para>
      In order to enable Barman support for <command>repmgr standby clone</command>, the following
      prerequisites must be met:
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <para>
            the Barman catalogue must include at least one valid backup for this server;
          </para>
        </listitem>
        <listitem>
          <para>
            the <varname>barman_host</varname> setting in <filename>repmgr.conf</filename> is set to the SSH
            hostname of the Barman server;
          </para>
        </listitem>
        <listitem>
          <para>
            the <varname>barman_server</varname> setting in <filename>repmgr.conf</filename> is the same as the
            server configured in Barman.
          </para>
        </listitem>
      </itemizedlist>
    </para>
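    <para>
      Whether the Barman catalogue contains a usable backup can be checked on the
      Barman server itself. As a minimal sketch (assuming a Barman server configured
      as <literal>pg</literal>, as in the example below), Barman's own commands will
      verify the server configuration and list any existing backups:
      <programlisting>
$ barman check pg
$ barman list-backup pg</programlisting>
    </para>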
    <para>
      For example, assuming Barman is located on the host "<literal>barmansrv</literal>"
      under the "<literal>barman</literal>" user account,
      <filename>repmgr.conf</filename> should contain the following entries:
      <programlisting>
barman_host='barman@barmansrv'
barman_server='pg'</programlisting>
    </para>
    <para>
      Here <literal>pg</literal> corresponds to a section in Barman's configuration file for a specific
      server backup configuration, which would look something like:
      <programlisting>
[pg]
description = "Main cluster"
...</programlisting>
    </para>
    <para>
      More details on Barman configuration can be found in the
      <ulink url="https://docs.pgbarman.org/">Barman documentation</ulink>'s
      <ulink url="https://docs.pgbarman.org/#configuration">configuration section</ulink>.
    </para>
    <note>
      <para>
        To use a non-default Barman configuration file on the Barman server,
        specify this in <filename>repmgr.conf</filename> with <varname>barman_config</varname>:
        <programlisting>
barman_config='/path/to/barman.conf'</programlisting>
      </para>
    </note>

    <para>
      We also recommend configuring the <varname>restore_command</varname> setting in <filename>repmgr.conf</filename>
      to use the <command>barman-wal-restore</command> script
      (see section <xref linkend="cloning-from-barman-restore-command"/> below).
    </para>

    <tip>
      <simpara>
        If you have a non-default SSH configuration on the Barman
        server, e.g. using a port other than 22, then you can set those
        parameters in a dedicated <literal>Host</literal> section in <filename>~/.ssh/config</filename>
        corresponding to the value of <varname>barman_host</varname> in
        <filename>repmgr.conf</filename>. See the <literal>Host</literal>
        section in <command>man 5 ssh_config</command> for more details.
      </simpara>
    </tip>
    <para>
      If you wish to place WAL files in a location outside the main
      PostgreSQL data directory, set <option>--waldir</option>
      (PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
      <option>pg_basebackup_options</option> to the target directory
      (must be an absolute filepath). &repmgr; will create and
      symlink to this directory in exactly the same way
      <application>pg_basebackup</application> would.
    </para>
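    <para>
      As a sketch, assuming a hypothetical dedicated WAL location
      <filename>/var/lib/postgresql/wal</filename>, <filename>repmgr.conf</filename>
      would contain:
      <programlisting>
pg_basebackup_options='--waldir=/var/lib/postgresql/wal'</programlisting>
    </para>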
    <para>
      It's now possible to clone a standby from Barman, e.g.:
      <programlisting>
$ repmgr -f /etc/repmgr.conf -h node1 -U repmgr -d repmgr standby clone
NOTICE: destination directory "/var/lib/postgresql/data" provided
INFO: connecting to Barman server to verify backup for "test_cluster"
INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/data"
INFO: creating directory "/var/lib/postgresql/data/repmgr"...
INFO: connecting to Barman server to fetch server parameters
INFO: connecting to source node
DETAIL: current installation size is 30 MB
NOTICE: retrieving backup from Barman...
(...)
NOTICE: standby clone (from Barman) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
    </para>

    <note>
      <simpara>
        Barman support is automatically enabled if <varname>barman_server</varname>
        is set. Normally it is good practice to use Barman, for instance
        when fetching a base backup while cloning a standby; in any case,
        Barman mode can be disabled using the <literal>--without-barman</literal>
        command line option.
      </simpara>
    </note>

  </sect2>

  <sect2 id="cloning-from-barman-restore-command" xreflabel="Using Barman as a WAL file source">
    <title>Using Barman as a WAL file source</title>

    <indexterm>
      <primary>Barman</primary>
      <secondary>fetching archived WAL</secondary>
    </indexterm>

    <para>
      As a fallback in case streaming replication is interrupted, PostgreSQL can optionally
      retrieve WAL files from an archive, such as that provided by Barman. This is done by
      setting <varname>restore_command</varname> in <filename>recovery.conf</filename> to
      a valid shell command which can retrieve a specified WAL file from the archive.
    </para>
    <para>
      <command>barman-wal-restore</command> is a Python script provided as part of the <literal>barman-cli</literal>
      package (Barman 2.0 ~ 2.7) or as part of the core Barman distribution (Barman 2.8 and later).
    </para>
    <para>
      To use <command>barman-wal-restore</command> with &repmgr;,
      assuming Barman is located on the host "<literal>barmansrv</literal>"
      under the "<literal>barman</literal>" user account,
      and that <command>barman-wal-restore</command> is located as an executable at
      <filename>/usr/bin/barman-wal-restore</filename>,
      <filename>repmgr.conf</filename> should include the following lines:
      <programlisting>
barman_host='barman@barmansrv'
barman_server='pg'
restore_command='/usr/bin/barman-wal-restore barmansrv pg %f %p'</programlisting>
    </para>
    <note>
      <simpara>
        <command>barman-wal-restore</command> supports command line switches to
        control parallelism (<literal>--parallel=N</literal>) and compression
        (<literal>--bzip2</literal>, <literal>--gzip</literal>).
      </simpara>
    </note>
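    <para>
      For example, to fetch WAL files over two parallel connections, the
      <varname>restore_command</varname> shown above could be extended as follows
      (a sketch only; adjust the degree of parallelism to your environment):
      <programlisting>
restore_command='/usr/bin/barman-wal-restore --parallel=2 barmansrv pg %f %p'</programlisting>
    </para>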
  </sect2>
 </sect1>

 <sect1 id="cloning-replication-slots" xreflabel="Cloning and replication slots">
  <title>Cloning and replication slots</title>

  <indexterm>
    <primary>cloning</primary>
    <secondary>replication slots</secondary>
  </indexterm>

  <indexterm>
    <primary>replication slots</primary>
    <secondary>cloning</secondary>
  </indexterm>

  <para>
    Replication slots were introduced with PostgreSQL 9.4 and are designed to ensure
    that any standby connected to the primary using a replication slot will always
    be able to retrieve the required WAL files. This removes the need to manually
    manage WAL file retention by estimating the number of WAL files that need to
    be maintained on the primary using <varname>wal_keep_segments</varname>
    (PostgreSQL 13 and later: <varname>wal_keep_size</varname>).
    Do however be aware that if a standby is disconnected, WAL will continue to
    accumulate on the primary until either the standby reconnects or the replication
    slot is dropped.
  </para>
  <para>
    To enable &repmgr; to use replication slots, set the boolean parameter
    <varname>use_replication_slots</varname> in <filename>repmgr.conf</filename>:
    <programlisting>
use_replication_slots=true</programlisting>
  </para>
  <para>
    Replication slots must be enabled in <filename>postgresql.conf</filename> by
    setting the parameter <varname>max_replication_slots</varname> to at least the
    number of expected standbys (changes to this parameter require a server restart).
  </para>
  <para>
    When cloning a standby, &repmgr; will automatically generate an appropriate
    slot name, which is stored in the <literal>repmgr.nodes</literal> table, and create the slot
    on the upstream node:
    <programlisting>
repmgr=# SELECT node_id, upstream_node_id, active, node_name, type, priority, slot_name
           FROM repmgr.nodes ORDER BY node_id;
 node_id | upstream_node_id | active | node_name |  type   | priority |   slot_name
---------+------------------+--------+-----------+---------+----------+---------------
       1 |                  | t      | node1     | primary |      100 | repmgr_slot_1
       2 |                1 | t      | node2     | standby |      100 | repmgr_slot_2
       3 |                1 | t      | node3     | standby |      100 | repmgr_slot_3
(3 rows)</programlisting>

    <programlisting>
repmgr=# SELECT slot_name, slot_type, active, active_pid FROM pg_replication_slots ;
   slot_name   | slot_type | active | active_pid
---------------+-----------+--------+------------
 repmgr_slot_2 | physical  | t      |      23658
 repmgr_slot_3 | physical  | t      |      23687
(2 rows)</programlisting>
  </para>
  <para>
    Note that a slot name will be created by default for the primary but not
    actually used unless the primary is converted to a standby using e.g.
    <command>repmgr standby switchover</command>.
  </para>
  <para>
    Further information on replication slots in the PostgreSQL documentation:
    <ulink url="https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS">https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS</ulink>
  </para>
  <tip>
    <simpara>
      While replication slots can be useful for streaming replication, it's
      recommended to monitor for inactive slots, as these will cause WAL files to
      build up indefinitely, possibly leading to server failure (an example query
      is shown below).
    </simpara>
    <simpara>
      As an alternative we recommend using 2ndQuadrant's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
      which offloads WAL management to a separate server, removing the requirement to use a replication
      slot for each individual standby to reserve WAL. See section <xref linkend="cloning-from-barman"/>
      for more details on using &repmgr; together with Barman.
    </simpara>
  </tip>
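  <para>
    Inactive slots can be identified by querying the
    <literal>pg_replication_slots</literal> view on the primary; a minimal example
    query might be:
    <programlisting>
SELECT slot_name, slot_type, active
  FROM pg_replication_slots
 WHERE active IS FALSE;</programlisting>
  </para>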
 </sect1>

 <sect1 id="cloning-cascading" xreflabel="Cloning and cascading replication">
  <title>Cloning and cascading replication</title>

  <indexterm>
    <primary>cloning</primary>
    <secondary>cascading replication</secondary>
  </indexterm>

  <para>
    Cascading replication, introduced with PostgreSQL 9.2, enables a standby server
    to replicate from another standby server rather than directly from the primary,
    meaning replication changes "cascade" down through a hierarchy of servers. This
    can be used to reduce load on the primary and minimize bandwidth usage between
    sites. For more details, see the
    <ulink url="https://www.postgresql.org/docs/current/warm-standby.html#CASCADING-REPLICATION">
    PostgreSQL cascading replication documentation</ulink>.
  </para>
  <para>
    &repmgr; supports cascading replication. When cloning a standby,
    set the command-line parameter <literal>--upstream-node-id</literal> to the
    <varname>node_id</varname> of the server the standby should connect to, and
    &repmgr; will create <filename>recovery.conf</filename> to point to it. Note
    that if <literal>--upstream-node-id</literal> is not explicitly provided,
    &repmgr; will set the standby's <filename>recovery.conf</filename> to
    point to the primary node.
  </para>
  <para>
    To demonstrate cascading replication, first ensure you have a primary and standby
    set up as shown in the <xref linkend="quickstart"/>.
    Then create an additional standby server with <filename>repmgr.conf</filename> looking
    like this:
    <programlisting>
node_id=3
node_name=node3
conninfo='host=node3 user=repmgr dbname=repmgr'
data_directory='/var/lib/postgresql/data'</programlisting>
  </para>
  <para>
    Clone this standby (using the connection parameters for the existing standby),
    ensuring <literal>--upstream-node-id</literal> is provided with the <varname>node_id</varname>
    of the previously created standby (if following the example, this will be <literal>2</literal>):
    <programlisting>
$ repmgr -h node2 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --upstream-node-id=2
NOTICE: using configuration file "/etc/repmgr.conf"
NOTICE: destination directory "/var/lib/postgresql/data" provided
INFO: connecting to upstream node
INFO: connected to source node, checking its state
NOTICE: checking for available walsenders on upstream node (2 required)
INFO: sufficient walsenders available on upstream node (2 required)
INFO: successfully connected to source node
DETAIL: current installation size is 29 MB
INFO: creating directory "/var/lib/postgresql/data"...
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing: 'pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node2 -U repmgr -X stream '
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>

    then register it (note that <literal>--upstream-node-id</literal> must be provided here
    too):
    <programlisting>
$ repmgr -f /etc/repmgr.conf standby register --upstream-node-id=2
NOTICE: standby node "node3" (ID: 3) successfully registered</programlisting>
  </para>
  <para>
    After starting the standby, the cluster will look like this, showing that <literal>node3</literal>
    is attached to <literal>node2</literal>, not the primary (<literal>node1</literal>):
    <programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+--------------------------------------
 1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
 2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
 3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
  </para>
  <tip>
    <simpara>
      Under some circumstances when setting up a cascading replication
      cluster, you may wish to clone a downstream standby whose upstream node
      does not yet exist. In this case you can clone from the primary (or
      another upstream node); provide the parameter <literal>--upstream-conninfo</literal>
      to explicitly set the upstream's <varname>primary_conninfo</varname> string
      in <filename>recovery.conf</filename>.
    </simpara>
  </tip>
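  <para>
    As a sketch only (the upstream host <literal>node4</literal> here is hypothetical
    and not yet provisioned; the clone itself is taken from the primary
    <literal>node1</literal>), such an invocation might look like:
    <programlisting>
$ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone \
    --upstream-conninfo='host=node4 user=repmgr dbname=repmgr'</programlisting>
  </para>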
 </sect1>

 <sect1 id="cloning-advanced" xreflabel="Advanced cloning options">
  <title>Advanced cloning options</title>
  <indexterm>
    <primary>cloning</primary>
    <secondary>advanced options</secondary>
  </indexterm>

  <sect2 id="cloning-advanced-pg-basebackup-options" xreflabel="pg_basebackup options when cloning a standby">
    <title>pg_basebackup options when cloning a standby</title>
    <para>
      As &repmgr; uses <command>pg_basebackup</command> to clone a standby, it's possible to
      provide additional parameters for <command>pg_basebackup</command> to customise the
      cloning process.
    </para>

    <para>
      By default, <command>pg_basebackup</command> performs a checkpoint before beginning the backup
      process. However, a normal checkpoint may take some time to complete;
      a fast checkpoint can be forced with <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>'s
      <literal>-c/--fast-checkpoint</literal> option.
      Note that this may impact performance of the server being cloned from (typically the primary)
      so should be used with care.
    </para>
    <tip>
      <simpara>
        If <application>Barman</application> is set up for the cluster, it's possible to
        clone the standby directly from Barman, without any impact on the server the standby
        is being cloned from. For more details see <xref linkend="cloning-from-barman"/>.
      </simpara>
    </tip>
    <para>
      Other options can be passed to <command>pg_basebackup</command> by including them
      in the <filename>repmgr.conf</filename> setting <varname>pg_basebackup_options</varname>.
    </para>
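    <para>
      For example, to reduce the I/O and network impact on the source server,
      <command>pg_basebackup</command>'s <option>--max-rate</option> option can be
      passed through (a sketch; the rate shown is an arbitrary illustration):
      <programlisting>
pg_basebackup_options='--max-rate=32M'</programlisting>
    </para>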
    <para>
      Note that by default, &repmgr; executes <command>pg_basebackup</command> with <option>-X/--wal-method</option>
      (PostgreSQL 9.6 and earlier: <option>-X/--xlog-method</option>) set to <literal>stream</literal>.
      From PostgreSQL 9.6, if replication slots are in use, it will also create a replication slot before
      running the base backup, and execute <command>pg_basebackup</command> with the
      <option>-S/--slot</option> option set to the name of the previously created replication slot.
    </para>
    <para>
      These parameters can be set by the user in <varname>pg_basebackup_options</varname>, in which case they
      will override the &repmgr; default values. However, normally there's no reason to do this.
    </para>
    <para>
      If using a separate directory to store WAL files, provide the option <literal>--waldir</literal>
      (<literal>--xlogdir</literal> in PostgreSQL 9.6 and earlier) with the absolute path to the
      WAL directory. Any WAL files generated during the cloning process will be copied here, and
      a symlink will automatically be created from the main data directory.
    </para>
    <tip>
      <para>
        The <literal>--waldir</literal> (<literal>--xlogdir</literal>) option,
        if present in <varname>pg_basebackup_options</varname>, will be honoured by &repmgr;
        when cloning from Barman (&repmgr; 5.2 and later).
      </para>
    </tip>
    <para>
      See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
      for more details of available options.
    </para>
  </sect2>

  <sect2 id="cloning-advanced-managing-passwords" xreflabel="Managing passwords">
    <title>Managing passwords</title>
    <indexterm>
      <primary>cloning</primary>
      <secondary>using passwords</secondary>
    </indexterm>

    <para>
      If replication connections to a standby's upstream server are password-protected,
      the standby must be able to provide the password so it can begin streaming replication.
    </para>

    <para>
      The recommended way to do this is to store the password in the <literal>postgres</literal> system
      user's <filename>~/.pgpass</filename> file. For more information on using the password file, see
      the documentation section <xref linkend="configuration-password-file"/>.
    </para>

    <note>
      <para>
        If using a <filename>pgpass</filename> file, an entry for the replication user (by default the
        user who connects to the <literal>repmgr</literal> database) <emphasis>must</emphasis>
        be provided, with database name set to <literal>replication</literal>, e.g.:
        <programlisting>
node1:5432:replication:repmgr:12345</programlisting>
      </para>
    </note>

    <para>
      If, for whatever reason, you wish to include the password in <filename>recovery.conf</filename>,
      set <varname>use_primary_conninfo_password</varname> to <literal>true</literal> in
      <filename>repmgr.conf</filename>. This will read a password set in <varname>PGPASSWORD</varname>
      (but not <filename>~/.pgpass</filename>) and place it into the <varname>primary_conninfo</varname>
      string in <filename>recovery.conf</filename>. Note that <varname>PGPASSWORD</varname>
      will need to be set during any action which causes <filename>recovery.conf</filename> to be
      rewritten, e.g. <xref linkend="repmgr-standby-follow"/>.
    </para>
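    <para>
      As a minimal sketch (the password shown is a placeholder), this means setting the
      following in <filename>repmgr.conf</filename>:
      <programlisting>
use_primary_conninfo_password=true</programlisting>
      and exporting <varname>PGPASSWORD</varname> before executing the command which
      rewrites <filename>recovery.conf</filename>, e.g.:
      <programlisting>
$ export PGPASSWORD='something-secret'
$ repmgr -f /etc/repmgr.conf standby follow</programlisting>
    </para>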
  </sect2>

  <sect2 id="cloning-advanced-replication-user" xreflabel="Separate replication user">
    <title>Separate replication user</title>
    <para>
      In some circumstances it might be desirable to create a dedicated replication-only
      user (in addition to the user who manages the &repmgr; metadata). In this case,
      the replication user should be set in <filename>repmgr.conf</filename> via the parameter
      <varname>replication_user</varname>; &repmgr; will use this value when making
      replication connections and generating <filename>recovery.conf</filename>. This
      value will also be stored in the <literal>repmgr.nodes</literal>
      table for each node; it no longer needs to be explicitly specified when
      cloning a node or executing <xref linkend="repmgr-standby-follow"/>.
    </para>
  </sect2>

  <sect2 id="cloning-advanced-tablespace-mapping" xreflabel="Tablespace mapping">
    <title>Tablespace mapping</title>
    <indexterm>
      <primary>tablespace mapping</primary>
    </indexterm>
    <para>
      &repmgr; provides a <option>tablespace_mapping</option> configuration
      file option, which makes it possible to map a tablespace on the source node to
      a different location on the local node.
    </para>
    <para>
      To use this, add <option>tablespace_mapping</option> to <filename>repmgr.conf</filename>
      like this:
      <programlisting>
tablespace_mapping='/var/lib/pgsql/tblspc1=/data/pgsql/tblspc1'</programlisting>
    </para>
    <para>
      where the left-hand value represents the tablespace on the source node,
      and the right-hand value represents the tablespace on the standby to be cloned.
    </para>
    <para>
      This parameter can be provided multiple times.
    </para>
  </sect2>

 </sect1>

</chapter>