1<appendix id="appendix-faq" xreflabel="FAQ">
2
3  <title>FAQ (Frequently Asked Questions)</title>
4
5  <indexterm>
6    <primary>FAQ (Frequently Asked Questions)</primary>
7  </indexterm>
8
9  <sect1 id="faq-general" xreflabel="General">
10    <title>General</title>
11
12    <sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
13      <title>What's the difference between the repmgr versions?</title>
14      <para>
15        &repmgr; 4 is a complete rewrite of the previous &repmgr; code base
16        and implements &repmgr; as a PostgreSQL extension. It
17        supports all PostgreSQL versions from 9.3 (although some &repmgr;
18        features are not available for PostgreSQL 9.3 and 9.4).
19      </para>
20      <note>
21        <para>
22          &repmgr; 5 is fundamentally the same code base as &repmgr; 4, but provides
23          support for the revised replication configuration mechanism in PostgreSQL 12.
24        </para>
25        <para>
26          Support for PostgreSQL 9.3 is no longer available from &repmgr; 5.2.
27        </para>
28      </note>
29      <para>
30        &repmgr; 3.x builds on the improved replication facilities added
31        in PostgreSQL 9.3, as well as improved automated failover support
32        via &repmgrd;, and is not compatible with PostgreSQL 9.2
33        and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
34        series is no longer maintained.
35      </para>
36      <para>
        &repmgr; 2.x supports PostgreSQL 9.0 to 9.3. While it is compatible
        with PostgreSQL 9.3, we recommend using &repmgr; 4.x. &repmgr; 2.x is
        no longer maintained.
40      </para>
41      <para>
42        See also <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
43        and <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
44      </para>
45    </sect2>
46
47    <sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
48      <title>What's the advantage of using replication slots?</title>
49      <para>
50        Replication slots, introduced in PostgreSQL 9.4, ensure that the
51        primary server will retain WAL files until they have been consumed
52        by all standby servers. This means standby servers should never
53        fail due to not being able to retrieve required WAL files from the
54        primary.
55      </para>
56      <para>
        However, this also means that if a standby is no longer connected to the
        primary, the presence of the replication slot will cause WAL files
        to be retained indefinitely, which will eventually lead to disk space
        exhaustion.
61      </para>
62
63      <tip>
64        <para>
65          2ndQuadrant's recommended configuration is to configure
66          <ulink url="https://www.pgbarman.org/">Barman</ulink> as a fallback
67          source of WAL files, rather than maintain replication slots for
68          each standby. See also: <link linkend="cloning-from-barman-restore-command">Using Barman as a WAL file source</link>.
69        </para>
70      </tip>
71    </sect2>
72
73    <sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
74      <title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
75      <para>
        Normally at least the same number as the number of standbys which will connect
        to the node. Note that changes to <varname>max_replication_slots</varname> require a server
        restart to take effect. As there is no particular penalty for unused
        replication slots, setting a higher figure will make adding new nodes
        easier.
81      </para>
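      <para>
        As a minimal sketch, the setting could be changed and verified as follows. The superuser
        name <literal>postgres</literal>, the value <literal>10</literal> and the data directory
        path are assumptions for illustration only; adjust them for your environment.
      </para>
      <programlisting>
# Persist the new value (10 is an arbitrary example)
psql -U postgres -c "ALTER SYSTEM SET max_replication_slots = 10"

# The change only takes effect after a server restart
pg_ctl -D /path/to/data/directory restart

# Confirm the running value
psql -U postgres -c "SHOW max_replication_slots"</programlisting>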
82    </sect2>
83
84    <sect2 id="faq-hash-index" xreflabel="Hash indexes">
85      <title>Does &repmgr; support hash indexes?</title>
86      <para>
        Before PostgreSQL 10, hash indexes were not WAL-logged, so they are not suitable
        for use with streaming replication in PostgreSQL 9.6 and earlier. See the
89        <ulink url="https://www.postgresql.org/docs/9.6/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
90        for details.
91      </para>
92      <para>
93        From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
94        in a streaming replication cluster.
95      </para>
96    </sect2>
97
98    <sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
99      <title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
100      <para>
101        For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
102        approach is to upgrade a standby to the latest version, perform a
103        <link linkend="performing-switchover">switchover</link> promoting it to a primary,
104        then upgrade the former primary.
105      </para>
106      <para>
107        For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
108        the traditional approach is to "reseed" a cluster by upgrading a single
109        node with <ulink url="https://www.postgresql.org/docs/current/pgupgrade.html">pg_upgrade</ulink>
110        and recloning standbys from this.
111      </para>
112      <para>
113        To minimize downtime during major upgrades from PostgreSQL 9.4 and later,
114        <ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
115        can be used to set up a parallel cluster using the newer PostgreSQL version,
116        which can be kept in sync with the existing production cluster until the
117        new cluster is ready to be put into production.
118      </para>
119    </sect2>
120
121    <sect2 id="faq-libdir-repmgr-error">
122      <title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
123      <para>
124        It means the &repmgr; extension code is not installed in the
125        PostgreSQL application directory. This typically happens when using PostgreSQL
126        packages provided by a third-party vendor, which often have different
127        filesystem layouts.
128      </para>
129      <para>
        Use PostgreSQL packages provided by the community or 2ndQuadrant; if this
        is not possible, contact your vendor for assistance.
132      </para>
133    </sect2>
134
135    <sect2 id="faq-old-packages">
136      <title>How can I obtain old versions of &repmgr; packages?</title>
137      <para>
138        See appendix <xref linkend="packages-old-versions"/> for details.
139      </para>
140    </sect2>
141
142    <sect2 id="faq-repmgr-required-for-replication">
143      <title>Is &repmgr; required for streaming replication?</title>
144      <para>
145        No.
146      </para>
147      <para>
148        &repmgr; (together with &repmgrd;) assists with
149        <emphasis>managing</emphasis> replication. It does not actually perform replication, which
150        is part of the core PostgreSQL functionality.
151      </para>
152    </sect2>
153
154    <sect2 id="faq-what-if-repmgr-uninstalled">
155      <title>Will replication stop working if &repmgr; is uninstalled?</title>
156      <para>
157        No. See preceding question.
158      </para>
159    </sect2>
160
161    <sect2 id="faq-version-mix">
162      <title>Does it matter if different &repmgr; versions are present in the replication cluster?</title>
163      <para>
        Yes. If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
        &repmgr; (in particular &repmgrd;) may not run, or may not run properly.
        In the worst case (if different &repmgrd; versions are running and there are
        differences in the failover implementation), this can break
        your replication cluster.
169      </para>
170      <para>
        If different &quot;minor&quot; &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
        &repmgr; will function, but we strongly recommend always running the same version
        to avoid surprises, e.g. a newer version behaving slightly
        differently to the older version.
175      </para>
176      <para>
177        See also <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
178      </para>
179    </sect2>
180
181    <sect2 id="faq-upgrade-repmgr">
182      <title>Should I upgrade &repmgr;?</title>
183      <para>
184        Yes.
185      </para>
186      <para>
187        We don't release new versions for fun, you know. Upgrading may require a little effort,
188        but running an older &repmgr; version with bugs which have since been fixed may end up
189        costing you more effort. The same applies to PostgreSQL itself.
190      </para>
191    </sect2>
192
193    <sect2 id="faq-repmgr-conf-data-directory">
194      <title>Why do I need to specify the data directory location in repmgr.conf?</title>
195      <para>
196        In some circumstances &repmgr; may need to access a PostgreSQL data
197        directory while the PostgreSQL server is not running, e.g. to confirm
198        it shut down cleanly during a <link linkend="performing-switchover">switchover</link>.
199      </para>
200      <para>
201        Additionally, this provides support when using &repmgr; on PostgreSQL 9.6 and
202        earlier, where the <literal>repmgr</literal> user is not a superuser; in that
203        case the <literal>repmgr</literal> user will not be able to access the
204        <literal>data_directory</literal> configuration setting, access to which is restricted
205        to superusers.
206      </para>
207      <para>
208        In PostgreSQL 10 and later, non-superusers can be added to the
209        <ulink url="https://www.postgresql.org/docs/current/default-roles.html">default role</ulink>
210        <option>pg_read_all_settings</option> (or the meta-role <option>pg_monitor</option>)
211        which will enable them to read this setting.
212      </para>
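      <para>
        For example, assuming the &repmgr; database user is named <literal>repmgr</literal> and
        a superuser named <literal>postgres</literal> is available (both names are illustrative),
        the role could be granted like this on PostgreSQL 10 or later:
      </para>
      <programlisting>
# Allow the non-superuser "repmgr" role to read settings such as data_directory
psql -U postgres -c "GRANT pg_read_all_settings TO repmgr"</programlisting>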
213    </sect2>
214
    <sect2 id="faq-third-party-packages" xreflabel="Compatibility with third party vendor packages">
216      <title>Are &repmgr; packages compatible with <literal>$third_party_vendor</literal>'s packages?</title>
217      <para>
218        &repmgr; packages provided by 2ndQuadrant are compatible with the community-provided PostgreSQL
219        packages and any software provided by 2ndQuadrant.
220      </para>
221      <para>
222        A number of other vendors provide their own versions of PostgreSQL packages, often with different
223        package naming schemes and/or file locations.
224      </para>
225      <para>
226        We cannot guarantee that &repmgr; packages will be compatible with these packages.
227        It may be possible to override package dependencies (e.g. <literal>rpm --nodeps</literal>
228        for CentOS-based systems or <literal>dpkg --force-depends</literal> for Debian-based systems).
229      </para>
230    </sect2>
231  </sect1>
232
233  <sect1 id="faq-repmgr" xreflabel="repmgr">
234    <title><command>repmgr</command></title>
235
236    <sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
237      <title>Can I register an existing PostgreSQL server with repmgr?</title>
238      <para>
239        Yes, any existing PostgreSQL server which is part of the same replication
240        cluster can be registered with &repmgr;. There's no requirement for a
241        standby to have been cloned using &repmgr;.
242      </para>
243    </sect2>
244
245    <sect2 id="faq-repmgr-clone-other-source" >
246      <title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
247
248      <para>
249        For a standby which has been manually cloned or recovered from an external
250        backup manager such as Barman, the command
251        <command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>
252        can be used to create the correct <filename>recovery.conf</filename> file for
253        use with &repmgr; (and will create a replication slot if required). Once this has been done,
254        <link linkend="repmgr-standby-register">register the node</link> as usual.
255      </para>
256    </sect2>
257
258    <sect2 id="faq-repmgr-recovery-conf" >
259      <title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
260      <para>
261        See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
262      </para>
263    </sect2>
264
265    <sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
266      <title>How can a failed primary be re-added as a standby?</title>
267      <para>
268        This is a two-stage process. First, the failed primary's data directory
269        must be re-synced with the current primary; secondly the failed primary
270        needs to be re-registered as a standby.
271      </para>
272      <para>
273        It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
274        directory, which will usually be much
275        faster than re-cloning the server. However <command>pg_rewind</command> can only
276        be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
277        data checksums were enabled when the cluster was initialized.
278      </para>
279      <para>
280        Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
281        distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
282      </para>
283      <para>
284        &repmgr; provides the command <command>repmgr node rejoin</command> which can
285        optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin"/>
286        documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
287      </para>
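      <para>
        A possible invocation, run on the failed primary while its PostgreSQL instance is shut down,
        might look like the following. The configuration file path and the connection string for
        <literal>node1</literal> (assumed here to be the current primary) are placeholders.
      </para>
      <programlisting>
# Check what would happen, without changing anything
repmgr node rejoin -f /etc/repmgr.conf \
    -d 'host=node1 dbname=repmgr user=repmgr' \
    --force-rewind --verbose --dry-run

# If the dry run reports no problems, repeat without --dry-run
repmgr node rejoin -f /etc/repmgr.conf \
    -d 'host=node1 dbname=repmgr user=repmgr' \
    --force-rewind --verbose</programlisting>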
288      <para>
289        If <command>pg_rewind</command> cannot be used, then the data directory will need
290        to be re-cloned from scratch.
291      </para>
292
293    </sect2>
294
295    <sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
296      <title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
297      <para>
298        Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
299        with the <literal>--dry-run</literal> option; this will report any configuration problems
300        which need to be rectified.
301      </para>
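      <para>
        For example, run from the prospective standby, assuming the primary is reachable as
        <literal>node1</literal> and the &repmgr; user and database are both named
        <literal>repmgr</literal> (all illustrative values):
      </para>
      <programlisting>
# Verify the primary's configuration without copying any data
repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run</programlisting>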
302    </sect2>
303
304    <sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
305      <title>When cloning a standby, how can I get &repmgr; to copy
306        <filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
307        directory in <filename>/etc</filename>?</title>
308      <para>
309        Use the command line option <literal>--copy-external-config-files</literal>. For more details
310        see <xref linkend="repmgr-standby-clone-config-file-copying"/>.
311      </para>
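      <para>
        A sketch of such a clone operation is shown below; the host name <literal>node1</literal>,
        the user and database names and the configuration file path are assumptions for illustration.
        Check the linked section for the prerequisites of this option before relying on it.
      </para>
      <programlisting>
# Clone the standby and also copy configuration files located outside the data directory
repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone \
    --copy-external-config-files</programlisting>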
312    </sect2>
313
314    <sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
315      <title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
316        in <filename>postgresql.conf</filename> if I'm not using &repmgrd;?</title>
317      <para>
318        No, the <literal>repmgr</literal> shared library is only needed when running &repmgrd;.
319        If you later decide to run &repmgrd;, you just need to add
320        <literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
321      </para>
322    </sect2>
323
324    <sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
325      <title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
326        but <command>repmgr</command>/&repmgrd; complains it can't connect to the server... Why?</title>
327      <para>
        <command>repmgr</command> and &repmgrd; need to be able to connect to the repmgr database
        with a normal connection to query metadata, so <filename>pg_hba.conf</filename> must also
        permit that connection. The <literal>replication</literal> connection
        permission is for PostgreSQL's streaming replication (and the replication user doesn't
        necessarily need to be the <literal>repmgr</literal> user).
331      </para>
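      <para>
        One way to check the normal connection independently of &repmgr; is to connect with
        <application>psql</application> from the node where the error occurs. The host, user and
        database names below are placeholders.
      </para>
      <programlisting>
# This must succeed for repmgr and repmgrd to be able to query their metadata
psql 'host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>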
332    </sect2>
333
334    <sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
335      <title>When cloning a standby, why do I need to provide the connection parameters
336        for the primary server on the command line, not in the configuration file?</title>
337      <para>
338        Cloning a standby is a one-time action; the role of the server being cloned
339        from could change, so fixing it in the configuration file would create
340        confusion. If &repmgr; needs to establish a connection to the primary
341        server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
342        node, and if necessary scan the replication cluster until it locates the active primary.
343      </para>
344    </sect2>
345
346    <sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
347      <title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
348      <para>
        Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
350        and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
351        For more details see <xref linkend="cloning-advanced-pg-basebackup-options"/>.
352      </para>
353      <para>
354        In &repmgr; 5.2 and later, this setting will also be honoured when cloning from Barman.
355      </para>
356    </sect2>
357
358    <sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
359      <title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
360        table?</title>
361      <para>
362        Under some circumstances event notifications can be generated for servers
363        which have not yet been registered; it's also useful to retain a record
364        of events which includes servers removed from the replication cluster
365        which no longer have an entry in the <literal>repmgr.nodes</literal> table.
366      </para>
367    </sect2>
368
369    <sect2 id="faq-repmgr-recovery-conf-quoted-values" xreflabel="Quoted values in recovery.conf">
370      <title>Why are some values in <filename>recovery.conf</filename> surrounded by pairs of single quotes?</title>
371      <para>
372        This is to ensure that user-supplied values which are written as parameter values in <filename>recovery.conf</filename>
373        are escaped correctly and do not cause errors when <filename>recovery.conf</filename> is parsed.
374      </para>
375      <para>
376        The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
377        of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
378        even if the string does not contain any characters which need escaping.
379      </para>
380    </sect2>
381
382    <sect2 id="faq-repmgr-exclude-metadata-from-dump" xreflabel="Excluding repmgr metadata from pg_dump output">
383      <title>How can I exclude &repmgr; metadata from <application>pg_dump</application> output?</title>
384      <para>
385        Beginning with &repmgr; 5.2, the metadata tables associated with the &repmgr; extension
386        (<literal>repmgr.nodes</literal>, <literal>repmgr.events</literal> and <literal>repmgr.monitoring_history</literal>)
387        have been marked as dumpable as they contain configuration and user-generated data.
388      </para>
389      <para>
390        To exclude these from <application>pg_dump</application> output, add the flag <option>--exclude-schema=repmgr</option>.
391      </para>
392      <para>
        To exclude individual &repmgr; metadata tables from <application>pg_dump</application> output, add the flag
        <option>--exclude-table</option>, e.g. <option>--exclude-table=repmgr.monitoring_history</option>.
        This flag can be provided multiple times to exclude several tables.
396      </para>
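      <para>
        The two approaches might be used as follows; the database name <literal>mydb</literal>
        and the output file names are hypothetical.
      </para>
      <programlisting>
# Exclude the entire repmgr schema
pg_dump --exclude-schema=repmgr --file=dump.sql mydb

# Exclude only selected repmgr tables, repeating the flag for each one
pg_dump --exclude-table=repmgr.monitoring_history \
        --exclude-table=repmgr.events \
        --file=dump_partial.sql mydb</programlisting>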
397    </sect2>
398
399  </sect1>
400
401  <sect1 id="faq-repmgrd" xreflabel="repmgrd">
402    <title>&repmgrd;</title>
403
404
405    <sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
406      <title>How can I prevent a node from ever being promoted to primary?</title>
407      <para>
408        In <filename>repmgr.conf</filename>, set its priority to a value of <literal>0</literal>; apply the changed setting with
409        <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
410      </para>
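      <para>
        As a short sketch, assuming the configuration file is located at
        <filename>/etc/repmgr.conf</filename> (an illustrative path):
      </para>
      <programlisting>
# After setting "priority=0" in /etc/repmgr.conf on the standby,
# update the node's record in the repmgr metadata
repmgr -f /etc/repmgr.conf standby register --force</programlisting>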
411      <para>
412        Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
413        be considered as a promotion candidate.
414      </para>
415    </sect2>
416
417    <sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
418      <title>Does &repmgrd; support delayed standbys?</title>
419      <para>
420        &repmgrd; can monitor delayed standbys - those set up with
421        <varname>recovery_min_apply_delay</varname> set to a non-zero value
422        in <filename>recovery.conf</filename> - but as it's not currently possible
423        to directly examine the value applied to the standby, &repmgrd;
424        may not be able to properly evaluate the node as a promotion candidate.
425      </para>
426      <para>
427        We recommend that delayed standbys are explicitly excluded from promotion
428        by setting <varname>priority</varname> to <literal>0</literal> in
429        <filename>repmgr.conf</filename>.
430      </para>
431      <para>
432        Note that after registering a delayed standby, &repmgrd; will only start
433        once the metadata added in the primary node has been replicated.
434      </para>
435    </sect2>
436
437    <sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
438      <title>How can I get &repmgrd; to rotate its logfile?</title>
439      <para>
440        Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation"/>.
441      </para>
442
443    </sect2>
444
445    <sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
446      <title>I've recloned a failed primary as a standby, but &repmgrd; refuses to start?</title>
447      <para>
448        Check you registered the standby after recloning. If unregistered, the standby
449        cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
450        <literal>automatic</literal>, which is probably not what you want. &repmgrd; will start if
451        <varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
452        be monitored, if desired.
453      </para>
454    </sect2>
455
456    <sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
457      <title>
458        &repmgrd; ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
459      </title>
460      <para>
        <varname>promote_command</varname> and <varname>follow_command</varname> can be user-defined scripts,
        so &repmgr; will not apply <option>pg_bindir</option> even if the command executes &repmgr;. Always provide the full
        path; see <xref linkend="repmgrd-automatic-failover-configuration"/> for more details.
464      </para>
465    </sect2>
466
467    <sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
468      <title>
469        &repmgrd; aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
470      </title>
471      <para>
472        &repmgrd; does this to avoid starting up on a replication cluster
473        which is not in a healthy state. If the upstream is unavailable, &repmgrd;
474        may initiate a failover immediately after starting up, which could have unintended side-effects,
475        particularly if &repmgrd; is not running on other nodes.
476      </para>
477      <para>
        In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> table
        is out-of-date, which may lead to incorrect failover behaviour.
480      </para>
481      <para>
        The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
        starting &repmgrd;.
484      </para>
485    </sect2>
486
487  </sect1>
488</appendix>
489