<appendix id="appendix-faq" xreflabel="FAQ">

 <title>FAQ (Frequently Asked Questions)</title>

 <indexterm>
  <primary>FAQ (Frequently Asked Questions)</primary>
 </indexterm>

 <sect1 id="faq-general" xreflabel="General">
  <title>General</title>

  <sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
   <title>What's the difference between the repmgr versions?</title>
   <para>
    &repmgr; 4 is a complete rewrite of the previous &repmgr; code base
    and implements &repmgr; as a PostgreSQL extension. It
    supports all PostgreSQL versions from 9.3 (although some &repmgr;
    features are not available for PostgreSQL 9.3 and 9.4).
   </para>
   <note>
    <para>
     &repmgr; 5 is fundamentally the same code base as &repmgr; 4, but provides
     support for the revised replication configuration mechanism in PostgreSQL 12.
    </para>
    <para>
     Support for PostgreSQL 9.3 is no longer available from &repmgr; 5.2.
    </para>
   </note>
   <para>
    &repmgr; 3.x builds on the improved replication facilities added
    in PostgreSQL 9.3, as well as improved automated failover support
    via &repmgrd;, and is not compatible with PostgreSQL 9.2
    and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
    series is no longer maintained.
   </para>
   <para>
    &repmgr; 2.x supports PostgreSQL 9.0 to 9.3. While it is compatible
    with PostgreSQL 9.3, we recommend using &repmgr; 4.x. &repmgr; 2.x is
    no longer maintained.
   </para>
   <para>
    See also <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
    and <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
   </para>
  </sect2>

  <sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
   <title>What's the advantage of using replication slots?</title>
   <para>
    Replication slots, introduced in PostgreSQL 9.4, ensure that the
    primary server will retain WAL files until they have been consumed
    by all standby servers. This means standby servers should never
    fail due to not being able to retrieve required WAL files from the
    primary.
   </para>
   <para>
    However, this does mean that if a standby is no longer connected to the
    primary, the presence of the replication slot will cause WAL files
    to be retained indefinitely, and eventually lead to disk space
    exhaustion.
   </para>

   <tip>
    <para>
     2ndQuadrant's recommended approach is to configure
     <ulink url="https://www.pgbarman.org/">Barman</ulink> as a fallback
     source of WAL files, rather than maintain replication slots for
     each standby. See also: <link linkend="cloning-from-barman-restore-command">Using Barman as a WAL file source</link>.
    </para>
   </tip>
  </sect2>

  <sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
   <title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
   <para>
    Normally at least the same number as the number of standbys which will connect
    to the node. Note that changes to <varname>max_replication_slots</varname> require a server
    restart to take effect, and as there is no particular penalty for unused
    replication slots, setting a higher figure will make adding new nodes
    easier.
   </para>
  </sect2>

  <sect2 id="faq-hash-index" xreflabel="Hash indexes">
   <title>Does &repmgr; support hash indexes?</title>
   <para>
    Before PostgreSQL 10, hash indexes were not WAL-logged and are therefore not suitable
    for use in streaming replication in PostgreSQL 9.6 and earlier.
    See the
    <ulink url="https://www.postgresql.org/docs/9.6/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
    for details.
   </para>
   <para>
    From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
    in a streaming replication cluster.
   </para>
  </sect2>

  <sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
   <title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
   <para>
    For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
    approach is to upgrade a standby to the latest version, perform a
    <link linkend="performing-switchover">switchover</link> promoting it to a primary,
    then upgrade the former primary.
   </para>
   <para>
    For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
    the traditional approach is to "reseed" a cluster by upgrading a single
    node with <ulink url="https://www.postgresql.org/docs/current/pgupgrade.html">pg_upgrade</ulink>
    and recloning standbys from this.
   </para>
   <para>
    To minimize downtime during major upgrades from PostgreSQL 9.4 and later,
    <ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
    can be used to set up a parallel cluster using the newer PostgreSQL version,
    which can be kept in sync with the existing production cluster until the
    new cluster is ready to be put into production.
   </para>
  </sect2>

  <sect2 id="faq-libdir-repmgr-error">
   <title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
   <para>
    It means the &repmgr; extension code is not installed in the
    PostgreSQL application directory. This typically happens when using PostgreSQL
    packages provided by a third-party vendor, which often have different
    filesystem layouts.
   </para>
   <para>
    Use PostgreSQL packages provided by the community or 2ndQuadrant; if this
    is not possible, contact your vendor for assistance.
   </para>
  </sect2>

  <sect2 id="faq-old-packages">
   <title>How can I obtain old versions of &repmgr; packages?</title>
   <para>
    See appendix <xref linkend="packages-old-versions"/> for details.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-required-for-replication">
   <title>Is &repmgr; required for streaming replication?</title>
   <para>
    No.
   </para>
   <para>
    &repmgr; (together with &repmgrd;) assists with
    <emphasis>managing</emphasis> replication. It does not actually perform replication, which
    is part of the core PostgreSQL functionality.
   </para>
  </sect2>

  <sect2 id="faq-what-if-repmgr-uninstalled">
   <title>Will replication stop working if &repmgr; is uninstalled?</title>
   <para>
    No. See preceding question.
   </para>
  </sect2>

  <sect2 id="faq-version-mix">
   <title>Does it matter if different &repmgr; versions are present in the replication cluster?</title>
   <para>
    Yes. If different "major" &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
    &repmgr; (in particular &repmgrd;)
    may not run, or may not run properly, or in the worst case (if different &repmgrd;
    versions are running and there are differences in the failover implementation) may break
    your replication cluster.
   </para>
   <para>
    If different "minor" &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
    &repmgr; will function, but we strongly recommend always running the same version
    to avoid surprises, e.g. a newer version behaving slightly
    differently to the older version.
   </para>
   <para>
    See also <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
   </para>
  </sect2>

  <sect2 id="faq-upgrade-repmgr">
   <title>Should I upgrade &repmgr;?</title>
   <para>
    Yes.
   </para>
   <para>
    We don't release new versions for fun, you know. Upgrading may require a little effort,
    but running an older &repmgr; version with bugs which have since been fixed may end up
    costing you more effort. The same applies to PostgreSQL itself.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-conf-data-directory">
   <title>Why do I need to specify the data directory location in repmgr.conf?</title>
   <para>
    In some circumstances &repmgr; may need to access a PostgreSQL data
    directory while the PostgreSQL server is not running, e.g. to confirm
    it shut down cleanly during a <link linkend="performing-switchover">switchover</link>.
   </para>
   <para>
    Additionally, this provides support when using &repmgr; on PostgreSQL 9.6 and
    earlier, where the <literal>repmgr</literal> user is not a superuser; in that
    case the <literal>repmgr</literal> user will not be able to access the
    <literal>data_directory</literal> configuration setting, access to which is restricted
    to superusers.
   </para>
   <para>
    In PostgreSQL 10 and later, non-superusers can be added to the
    <ulink url="https://www.postgresql.org/docs/current/default-roles.html">default role</ulink>
    <option>pg_read_all_settings</option> (or the meta-role <option>pg_monitor</option>)
    which will enable them to read this setting.
   </para>
  </sect2>

  <sect2 id="faq-third-party-packages" xreflabel="Compatability with third party vendor packages">
   <title>Are &repmgr; packages compatible with <literal>$third_party_vendor</literal>'s packages?</title>
   <para>
    &repmgr; packages provided by 2ndQuadrant are compatible with the community-provided PostgreSQL
    packages and any software provided by 2ndQuadrant.
   </para>
   <para>
    A number of other vendors provide their own versions of PostgreSQL packages, often with different
    package naming schemes and/or file locations.
   </para>
   <para>
    We cannot guarantee that &repmgr; packages will be compatible with these packages.
    It may be possible to override package dependencies (e.g. <literal>rpm --nodeps</literal>
    for CentOS-based systems or <literal>dpkg --force-depends</literal> for Debian-based systems).
   </para>
  </sect2>
 </sect1>

 <sect1 id="faq-repmgr" xreflabel="repmgr">
  <title><command>repmgr</command></title>

  <sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
   <title>Can I register an existing PostgreSQL server with repmgr?</title>
   <para>
    Yes, any existing PostgreSQL server which is part of the same replication
    cluster can be registered with &repmgr;. There's no requirement for a
    standby to have been cloned using &repmgr;.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-clone-other-source" >
   <title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>

   <para>
    For a standby which has been manually cloned or recovered from an external
    backup manager such as Barman, the command
    <command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>
    can be used to create the correct <filename>recovery.conf</filename> file for
    use with &repmgr; (and will create a replication slot if required). Once this has been done,
    <link linkend="repmgr-standby-register">register the node</link> as usual.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-recovery-conf" >
   <title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
   <para>
    See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
   <title>How can a failed primary be re-added as a standby?</title>
   <para>
    This is a two-stage process. First, the failed primary's data directory
    must be re-synced with the current primary; second, the failed primary
    needs to be re-registered as a standby.
   </para>
   <para>
    It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
    directory, which will usually be much
    faster than re-cloning the server. However, <command>pg_rewind</command> can only
    be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
    data checksums were enabled when the cluster was initialized.
   </para>
   <para>
    Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
    distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
   </para>
   <para>
    &repmgr; provides the command <command>repmgr node rejoin</command> which can
    optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin"/>
    documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
   </para>
   <para>
    If <command>pg_rewind</command> cannot be used, then the data directory will need
    to be re-cloned from scratch.
   </para>

  </sect2>

  <sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
   <title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
   <para>
    Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
    with the <literal>--dry-run</literal> option; this will report any configuration problems
    which need to be rectified.
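    As a minimal sketch, assuming a primary reachable as <literal>node1</literal>, a
    user and database both named <literal>repmgr</literal>, and a configuration file at
    <filename>/etc/repmgr.conf</filename> (all hypothetical values to be adapted), the check
    might be run from the prospective standby like this:
    <programlisting>
    repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run</programlisting>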
   </para>
  </sect2>

  <sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
   <title>When cloning a standby, how can I get &repmgr; to copy
    <filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
    directory in <filename>/etc</filename>?</title>
   <para>
    Use the command line option <literal>--copy-external-config-files</literal>. For more details
    see <xref linkend="repmgr-standby-clone-config-file-copying"/>.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
   <title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
    in <filename>postgresql.conf</filename> if I'm not using &repmgrd;?</title>
   <para>
    No, the <literal>repmgr</literal> shared library is only needed when running &repmgrd;.
    If you later decide to run &repmgrd;, you just need to add
    <literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
   <title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
    but <command>repmgr</command>/&repmgrd; complains it can't connect to the server... Why?</title>
   <para>
    <command>repmgr</command> and &repmgrd; need to be able to connect to the repmgr database
    with a normal connection to query metadata. The <literal>replication</literal> connection
    permission is for PostgreSQL's streaming replication (and doesn't necessarily need to be the <literal>repmgr</literal> user).
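    As an illustrative sketch, <filename>pg_hba.conf</filename> would typically need both
    kinds of entry. The database and user names, the network range and the
    <literal>md5</literal> authentication method shown here are assumptions, not requirements:
    <programlisting>
    # replication connections, used by PostgreSQL streaming replication
    host    replication     repmgr      192.168.1.0/24      md5
    # normal connections to the repmgr database, used by repmgr and repmgrd
    host    repmgr          repmgr      192.168.1.0/24      md5</programlisting>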
   </para>
  </sect2>

  <sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
   <title>When cloning a standby, why do I need to provide the connection parameters
    for the primary server on the command line, not in the configuration file?</title>
   <para>
    Cloning a standby is a one-time action; the role of the server being cloned
    from could change, so fixing it in the configuration file would create
    confusion. If &repmgr; needs to establish a connection to the primary
    server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
    node, and if necessary scan the replication cluster until it locates the active primary.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
   <title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
   <para>
    Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
    and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
    For more details see <xref linkend="cloning-advanced-pg-basebackup-options"/>.
   </para>
   <para>
    In &repmgr; 5.2 and later, this setting will also be honoured when cloning from Barman.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
   <title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
    table?</title>
   <para>
    Under some circumstances event notifications can be generated for servers
    which have not yet been registered; it's also useful to retain a record
    of events which includes servers removed from the replication cluster
    which no longer have an entry in the <literal>repmgr.nodes</literal> table.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-recovery-conf-quoted-values" xreflabel="Quoted values in recovery.conf">
   <title>Why are some values in <filename>recovery.conf</filename> surrounded by pairs of single quotes?</title>
   <para>
    This is to ensure that user-supplied values which are written as parameter values in <filename>recovery.conf</filename>
    are escaped correctly and do not cause errors when <filename>recovery.conf</filename> is parsed.
   </para>
   <para>
    The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
    of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
    even if the string does not contain any characters which need escaping.
   </para>
  </sect2>

  <sect2 id="faq-repmgr-exclude-metadata-from-dump" xreflabel="Excluding repmgr metadata from pg_dump output">
   <title>How can I exclude &repmgr; metadata from <application>pg_dump</application> output?</title>
   <para>
    Beginning with &repmgr; 5.2, the metadata tables associated with the &repmgr; extension
    (<literal>repmgr.nodes</literal>, <literal>repmgr.events</literal> and <literal>repmgr.monitoring_history</literal>)
    have been marked as dumpable as they contain configuration and user-generated data.
   </para>
   <para>
    To exclude these from <application>pg_dump</application> output, add the flag <option>--exclude-schema=repmgr</option>.
   </para>
   <para>
    To exclude individual &repmgr; metadata tables from <application>pg_dump</application> output, add the flag
    e.g. <option>--exclude-table=repmgr.monitoring_history</option>.
    This flag can be provided multiple times
    to exclude more than one table.
   </para>
  </sect2>

 </sect1>

 <sect1 id="faq-repmgrd" xreflabel="repmgrd">
  <title>&repmgrd;</title>


  <sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
   <title>How can I prevent a node from ever being promoted to primary?</title>
   <para>
    In <filename>repmgr.conf</filename>, set its priority to a value of <literal>0</literal>; apply the changed setting with
    <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
   </para>
   <para>
    Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
    be considered as a promotion candidate.
   </para>
  </sect2>

  <sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
   <title>Does &repmgrd; support delayed standbys?</title>
   <para>
    &repmgrd; can monitor delayed standbys - those set up with
    <varname>recovery_min_apply_delay</varname> set to a non-zero value
    in <filename>recovery.conf</filename> - but as it's not currently possible
    to directly examine the value applied to the standby, &repmgrd;
    may not be able to properly evaluate the node as a promotion candidate.
   </para>
   <para>
    We recommend that delayed standbys are explicitly excluded from promotion
    by setting <varname>priority</varname> to <literal>0</literal> in
    <filename>repmgr.conf</filename>.
   </para>
   <para>
    Note that after registering a delayed standby, &repmgrd; will only start
    once the metadata added in the primary node has been replicated.
   </para>
  </sect2>

  <sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
   <title>How can I get &repmgrd; to rotate its logfile?</title>
   <para>
    Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation"/>.
   </para>

  </sect2>

  <sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
   <title>I've recloned a failed primary as a standby, but &repmgrd; refuses to start?</title>
   <para>
    Check you registered the standby after recloning. If unregistered, the standby
    cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
    <literal>automatic</literal>, which is probably not what you want. &repmgrd; will start if
    <varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
    be monitored, if desired.
   </para>
  </sect2>

  <sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
   <title>
    &repmgrd; ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
   </title>
   <para>
    <varname>promote_command</varname> or <varname>follow_command</varname> can be user-defined scripts,
    so &repmgr; will not apply <option>pg_bindir</option> even if executing &repmgr;. Always provide the full
    path; see <xref linkend="repmgrd-automatic-failover-configuration"/> for more details.
   </para>
  </sect2>

  <sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
   <title>
    &repmgrd; aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
   </title>
   <para>
    &repmgrd; does this to avoid starting up on a replication cluster
    which is not in a healthy state.
    If the upstream is unavailable, &repmgrd;
    may initiate a failover immediately after starting up, which could have unintended side-effects,
    particularly if &repmgrd; is not running on other nodes.
   </para>
   <para>
    In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> table
    is out-of-date, which may lead to incorrect failover behaviour.
   </para>
   <para>
    The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
    starting &repmgrd;.
   </para>
  </sect2>

 </sect1>
</appendix>