1<chapter id="performing-switchover" xreflabel="Performing a switchover with repmgr">
2 <title>Performing a switchover with repmgr</title>
3
4 <indexterm>
5  <primary>switchover</primary>
6 </indexterm>
7
8 <para>
9  A typical use-case for replication is a combination of primary and standby
10  server, with the standby serving as a backup which can easily be activated
11  in case of a problem with the primary. Such an unplanned failover would
12  normally be handled by promoting the standby, after which an appropriate
13  action must be taken to restore the old primary.
14 </para>
15 <para>
  In some cases, however, it is desirable to promote the standby in a planned
  way, e.g. so that maintenance can be performed on the primary; this kind of switchover
  is supported by the <xref linkend="repmgr-standby-switchover"/> command.
19 </para>
20 <para>
21  <command>repmgr standby switchover</command> differs from other &repmgr;
22  actions in that it also performs actions on other servers (the demotion
23  candidate, and optionally any other servers which are to follow the new primary),
24  which means passwordless SSH access is required to those servers from the one where
25  <command>repmgr standby switchover</command> is executed.
26 </para>
27 <note>
28  <simpara>
29   <command>repmgr standby switchover</command> performs a relatively complex
30   series of operations on two servers, and should therefore be performed after
31   careful preparation and with adequate attention. In particular you should
32   be confident that your network environment is stable and reliable.
33  </simpara>
34  <simpara>
   Additionally you should be sure that the current primary can be shut down
   quickly and cleanly. In particular, access from applications should be
   minimized or preferably blocked completely. Also be aware that if there
   is a backlog of files waiting to be archived, PostgreSQL will not shut
   down until archiving completes.
40  </simpara>
41  <simpara>
    We recommend running <command>repmgr standby switchover</command> at the
    most verbose logging level (<literal>--log-level=DEBUG --verbose</literal>)
    and capturing all output to assist in troubleshooting any problems.
45  </simpara>
46  <simpara>
47   Please also read carefully the sections <xref linkend="preparing-for-switchover"/> and
48   <xref linkend="switchover-caveats"/> below.
49  </simpara>
50 </note>
51
52 <sect1 id="preparing-for-switchover" xreflabel="Preparing for switchover">
53   <title>Preparing for switchover</title>
54
55   <indexterm>
56     <primary>switchover</primary>
57     <secondary>preparation</secondary>
58   </indexterm>
59
60   <para>
    As mentioned in the previous section, the success of the switchover operation depends on
    &repmgr; being able to shut down the current primary server quickly and cleanly.
63   </para>
64
65   <para>
     Ensure that the promotion candidate has sufficient free walsenders available
     (PostgreSQL configuration item <varname>max_wal_senders</varname>), and if replication
     slots are in use, at least one free slot is available for the demotion candidate
     (PostgreSQL configuration item <varname>max_replication_slots</varname>).
70   </para>
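   <para>
     As a rough illustration, the relevant <filename>postgresql.conf</filename> settings
     on the promotion candidate might look like the following. The values shown are
     arbitrary examples, not recommendations; what counts as "sufficient" depends on
     how many standbys and other replication clients will connect to the new primary.
     Note that PostgreSQL must be restarted for changes to either setting to take effect.
   </para>
   <programlisting>
    # postgresql.conf on the promotion candidate; values are illustrative only
    max_wal_senders = 10
    max_replication_slots = 10</programlisting>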
71
72   <para>
73     Ensure that a passwordless SSH connection is possible from the promotion candidate
74     (standby) to the demotion candidate (current primary). If <literal>--siblings-follow</literal>
75     will be used, ensure that passwordless SSH connections are possible from the
76     promotion candidate to all nodes attached to the demotion candidate
77     (including the witness server, if in use).
78   </para>
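   <para>
     One way to confirm this in advance is a non-interactive SSH test executed on the
     promotion candidate as the operating system user which runs &repmgr;. The following
     is a minimal sketch and not part of &repmgr;: the function name
     <literal>check_passwordless_ssh</literal> is hypothetical, and it assumes the
     OpenSSH client, whose <literal>BatchMode</literal> option makes the connection
     fail rather than prompt for a password.
   </para>
   <programlisting>
```shell
# Hypothetical helper (not part of repmgr): returns non-zero if any of the
# given hosts cannot be reached over SSH without a password prompt.
check_passwordless_ssh() {
    rc=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "OK: $host"
        else
            echo "FAILED: $host"
            rc=1
        fi
    done
    return "$rc"
}
```
   </programlisting>
   <para>
     For example, <command>check_passwordless_ssh node1</command> would test the connection
     to the demotion candidate; when <literal>--siblings-follow</literal> will be used,
     pass the host names of all sibling nodes as additional arguments.
   </para>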
79
80   <note>
81     <simpara>
82       &repmgr; expects to find the &repmgr; binary in the same path on the remote
83       server as on the local server.
84     </simpara>
85   </note>
86
87   <para>
88    Double-check which commands will be used to stop/start/restart the current
89    primary; this can be done by e.g. executing <command><link linkend="repmgr-node-service">repmgr node service</link></command>
90    on the current primary:
   <programlisting>
    repmgr -f /etc/repmgr.conf node service --list-actions --action=stop
    repmgr -f /etc/repmgr.conf node service --list-actions --action=start
    repmgr -f /etc/repmgr.conf node service --list-actions --action=restart</programlisting>
95
96   </para>
97
98   <para>
99     These commands can be defined in <filename>repmgr.conf</filename> with
100     <option>service_start_command</option>, <option>service_stop_command</option>
101     and <option>service_restart_command</option>.
102   </para>
103
104   <important>
105     <para>
       If &repmgr; is installed from a package, you should set these commands
       to use the appropriate service commands defined by the package/operating
       system, as these will ensure PostgreSQL is stopped/started properly,
       taking into account configuration and log file locations etc.
110     </para>
111     <para>
112       If the <option>service_*_command</option> options aren't defined, &repmgr; will
113       fall back to using <application>pg_ctl</application> to stop/start/restart
114       PostgreSQL, which may not work properly, particularly when executed on a remote
115       server.
116     </para>
117     <para>
118       For more details, see <xref linkend="configuration-file-service-commands"/>.
119     </para>
120   </important>
121
122   <note>
123    <simpara>
124     On <literal>systemd</literal> systems we strongly recommend using the appropriate
125     <command>systemctl</command> commands (typically run via <command>sudo</command>) to ensure
126     <literal>systemd</literal> is informed about the status of the PostgreSQL service.
127    </simpara>
128    <simpara>
129     If using <command>sudo</command> for the <command>systemctl</command> calls, make sure the
130     <command>sudo</command> specification doesn't require a real tty for the user. If not set
131     this way, <command>repmgr</command> will fail to stop the primary.
132    </simpara>
133   </note>
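  <para>
    As an illustration, a <filename>repmgr.conf</filename> fragment for a
    <literal>systemd</literal> system might look like the following. The unit name
    <literal>postgresql</literal> is an assumption; it varies between packages and
    PostgreSQL versions, so substitute the name used on your system.
  </para>
  <programlisting>
    service_start_command   = 'sudo systemctl start postgresql'
    service_stop_command    = 'sudo systemctl stop postgresql'
    service_restart_command = 'sudo systemctl restart postgresql'</programlisting>
  <para>
    A matching <filename>sudoers</filename> entry could be sketched as shown below,
    assuming that &repmgr; runs as the <literal>postgres</literal> system user and that
    <command>systemctl</command> is located at <filename>/usr/bin/systemctl</filename>.
    The <literal>!requiretty</literal> default removes the tty requirement for that user.
  </para>
  <programlisting>
    Defaults:postgres !requiretty
    postgres ALL = NOPASSWD: /usr/bin/systemctl start postgresql, \
                             /usr/bin/systemctl stop postgresql, \
                             /usr/bin/systemctl restart postgresql</programlisting>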
134
135   <para>
    Check that access from applications is minimized or preferably blocked
    completely, so applications are not unexpectedly interrupted.
138   </para>
139
140   <note>
141     <para>
142       If an exclusive backup is running on the current primary, or if WAL replay is paused on the standby,
143       &repmgr; will <emphasis>not</emphasis> perform the switchover.
144     </para>
145   </note>
146
147   <para>
148     Check there is no significant replication lag on standbys attached to the
149     current primary.
150   </para>
151
152   <para>
153    If WAL file archiving is set up, check that there is no backlog of files waiting
154    to be archived, as PostgreSQL will not finally shut down until all of these have been
155    archived. If there is a backlog exceeding <varname>archive_ready_warning</varname> WAL files,
156    &repmgr; will emit a warning before attempting to perform a switchover; you can also check
157    manually with <command>repmgr node check --archive-ready</command>.
158   </para>
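  <para>
    The replication lag and archive backlog checks can be scripted with
    <command>repmgr node check</command>, which provides the individual checks
    <literal>--replication-lag</literal> and <literal>--archive-ready</literal>.
    The following is a minimal sketch: the function name <literal>run_node_checks</literal>
    is hypothetical and not part of &repmgr;, and it treats any non-zero exit code
    (including a warning) as a reason to stop and investigate. Each check applies to the
    node on which it is executed.
  </para>
  <programlisting>
```shell
# Hypothetical helper (not part of repmgr): runs the named "repmgr node check"
# checks on the local node and reports any which do not return OK.
run_node_checks() {
    conf="$1"
    shift
    rc=0
    for check in "$@"; do
        # A non-zero exit code means the check returned a warning or an error.
        if ! repmgr -f "$conf" node check "--$check"; then
            echo "check \"$check\" did not return OK"
            rc=1
        fi
    done
    return "$rc"
}
```
  </programlisting>
  <para>
    For example, <command>run_node_checks /etc/repmgr.conf archive-ready</command> could be
    executed on the current primary, and
    <command>run_node_checks /etc/repmgr.conf replication-lag</command> on each standby.
  </para>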
159
160    <note>
161      <para>
162        From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
163        &repmgrd; instances to pause operations while the switchover
164        is being carried out, to prevent &repmgrd; from
165        unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing"/>.
166      </para>
167      <para>
168        Users of &repmgr; versions prior to 4.2 should ensure that &repmgrd;
169        is not running on any nodes while a switchover is being executed.
170      </para>
171    </note>
172
173
174   <para>
175    Finally, consider executing <command>repmgr standby switchover</command> with the
176    <literal>--dry-run</literal> option; this will perform any necessary checks and inform you about
177    success/failure, and stop before the first actual command is run (which would be the shutdown of the
178    current primary). Example output:
   <programlisting>
     $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run
     NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode
     INFO: SSH connection to host "node1" succeeded
     INFO: archive mode is "off"
     INFO: replication lag on this standby is 0 seconds
     INFO: all sibling nodes are reachable via SSH
     NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
     INFO: following shutdown command would be run on node "node1":
       "pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop"
     INFO: parameter "shutdown_check_timeout" is set to 60 seconds
   </programlisting>
191   </para>
192
193   <important>
194     <para>
      Be aware that <option>--dry-run</option> checks the prerequisites
      for performing the switchover and some basic sanity checks on the
      state of the database which might affect the switchover operation
      (e.g. replication lag); it cannot however guarantee the switchover
      operation will succeed. In particular, if the current primary
      does not shut down cleanly, &repmgr; will not be able to reliably
      execute the switchover (as there would be a danger of divergence
      between the former and new primary nodes).
203     </para>
204   </important>
205
206
207   <note>
208     <simpara>
209       See <xref linkend="repmgr-standby-switchover"/> for a full list of available
210       command line options and <filename>repmgr.conf</filename> settings relevant
211       to performing a switchover.
212     </simpara>
213   </note>
214
215   <sect2 id="switchover-pg-rewind" xreflabel="Switchover and pg_rewind">
216    <title>Switchover and pg_rewind</title>
217
218    <indexterm>
219      <primary>pg_rewind</primary>
220      <secondary>using with "repmgr standby switchover"</secondary>
221    </indexterm>
222    <para>
223      If the demotion candidate does not shut down smoothly or cleanly, there's a risk it
224      will have a slightly divergent timeline and will not be able to attach to the new
225      primary. To fix this situation without needing to reclone the old primary, it's
226      possible to use the <application>pg_rewind</application> utility, which will usually be
227      able to resync the two servers.
228    </para>
229    <para>
230      To have &repmgr; execute <application>pg_rewind</application> if it detects this
231      situation after promoting the new primary, add the <option>--force-rewind</option>
232      option.
233    </para>
234    <note>
235      <simpara>
236        If &repmgr; detects a situation where it needs to execute <application>pg_rewind</application>,
237        it will execute a <literal>CHECKPOINT</literal> on the new primary before executing
238        <application>pg_rewind</application>.
239      </simpara>
240    </note>
241    <para>
242      For more details on <application>pg_rewind</application>, see:
243      <ulink url="https://www.postgresql.org/docs/current/app-pgrewind.html">https://www.postgresql.org/docs/current/app-pgrewind.html</ulink>.
244    </para>
245    <para>
246      <application>pg_rewind</application> has been part of the core PostgreSQL distribution since
247      version 9.5. Users of PostgreSQL 9.4 will need to manually install it; the source code is available here:
248      <ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
     If the <application>pg_rewind</application>
     binary is not installed in the PostgreSQL <filename>bin</filename> directory, provide
     its full path on the demotion candidate with <option>--force-rewind</option>.
252    </para>
253    <para>
254      Note that building the 9.4 version of <application>pg_rewind</application> requires the PostgreSQL
255      source code.
256    </para>
257  </sect2>
258
259
260 </sect1>
261
262 <sect1 id="switchover-execution" xreflabel="Executing the switchover command">
263  <title>Executing the switchover command</title>
264
265  <indexterm>
266   <primary>switchover</primary>
267    <secondary>execution</secondary>
268  </indexterm>
269  <para>
270   To demonstrate switchover, we will assume a replication cluster with a
271   primary (<literal>node1</literal>) and one standby (<literal>node2</literal>);
272   after the switchover <literal>node2</literal> should become the primary with
273   <literal>node1</literal> following it.
274  </para>
275  <para>
276   The switchover command must be run from the standby which is to be promoted,
277   and in its simplest form looks like this:
  <programlisting>
    $ repmgr -f /etc/repmgr.conf standby switchover
    NOTICE: executing switchover on node "node2" (ID: 2)
    INFO: searching for primary node
    INFO: checking if node 1 is primary
    INFO: current primary node is 1
    INFO: SSH connection to host "node1" succeeded
    INFO: archive mode is "off"
    INFO: replication lag on this standby is 0 seconds
    NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
    NOTICE: stopping current primary node "node1" (ID: 1)
    NOTICE: issuing CHECKPOINT
    DETAIL: executing server command "pg_ctl -l /var/log/postgres/startup.log -D '/var/lib/pgsql/data' -m fast -W stop"
    INFO: checking primary status; 1 of 6 attempts
    NOTICE: current primary has been cleanly shut down at location 0/3001460
    NOTICE: promoting standby to primary
    DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' promote"
    server promoting
    NOTICE: STANDBY PROMOTE successful
    DETAIL: server "node2" (ID: 2) was successfully promoted to primary
    INFO: setting node 1's primary to node 2
    NOTICE: starting server using  "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' restart"
    NOTICE: NODE REJOIN successful
    DETAIL: node 1 is now attached to node 2
    NOTICE: switchover was successful
    DETAIL: node "node2" is now primary
    NOTICE: STANDBY SWITCHOVER is complete
  </programlisting>
306  </para>
307  <para>
308   The old primary is now replicating as a standby from the new primary, and the
309   cluster status will now look like this:
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | standby |   running | node2    | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
  </programlisting>
317  </para>
318  <para>
319    If &repmgrd; is in use, it's worth double-checking that
320    all nodes are unpaused by executing
321    <command><link linkend="repmgr-service-status">repmgr service status</link></command>
322    (&repmgr; 4.2 - 4.4: <command><link linkend="repmgr-service-status">repmgr daemon status</link></command>).
323  </para>
324
325   <note>
326     <para>
327       Users of &repmgr; versions prior to 4.2 will need to manually restart &repmgrd;
328       on all nodes after the switchover is completed.
329     </para>
330    </note>
331
332 </sect1>
333
334
335 <sect1 id="switchover-caveats" xreflabel="Caveats">
336  <title>Caveats</title>
337  <indexterm>
338   <primary>switchover</primary>
339    <secondary>caveats</secondary>
340  </indexterm>
341  <para>
342   <itemizedlist spacing="compact" mark="bullet">
343    <listitem>
344     <simpara>
345      If using PostgreSQL 9.4, you should ensure that the shutdown command
346      is configured to use PostgreSQL's <varname>fast</varname> shutdown mode (the default in 9.5
347      and later). If relying on <command>pg_ctl</command> to perform database server operations,
348      you should include <literal>-m fast</literal> in <varname>pg_ctl_options</varname>
349      in <filename>repmgr.conf</filename>.
350     </simpara>
351    </listitem>
352    <listitem>
353     <simpara>
      <command>pg_rewind</command> <emphasis>requires</emphasis> that either <varname>wal_log_hints</varname> is enabled, or that
      data checksums were enabled when the cluster was initialized. See the
      <ulink url="https://www.postgresql.org/docs/current/app-pgrewind.html">pg_rewind documentation</ulink>
      for details.
358     </simpara>
359    </listitem>
360   </itemizedlist>
361  </para>
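 <para>
  The two caveats above might translate into configuration along the following lines.
  This is a sketch only: the <varname>pg_ctl_options</varname> line is relevant only
  if <command>pg_ctl</command> is used to stop the server on PostgreSQL 9.4, and
  <varname>wal_log_hints</varname> is not needed if data checksums are already enabled.
 </para>
 <programlisting>
    # postgresql.conf (on all nodes); changing this setting requires a restart
    wal_log_hints = on

    # repmgr.conf (PostgreSQL 9.4 only, when pg_ctl is used to stop the server)
    pg_ctl_options = '-m fast'</programlisting>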
362 </sect1>
363
364 <sect1 id="switchover-troubleshooting" xreflabel="Troubleshooting">
365   <title>Troubleshooting switchover issues</title>
366
367   <indexterm>
368     <primary>switchover</primary>
369     <secondary>troubleshooting</secondary>
370   </indexterm>
371
372   <para>
373     As <link linkend="performing-switchover">emphasised previously</link>, performing a switchover
374     is a non-trivial operation and there are a number of potential issues which can occur.
375     While &repmgr; attempts to perform sanity checks, there's no guaranteed way of determining the success of
376     a switchover without actually carrying it out.
377   </para>
378
379   <sect2 id="switchover-troubleshooting-primary-shutdown">
380     <title>Demotion candidate (old primary) does not shut down</title>
381     <para>
382       &repmgr; may abort a switchover with a message like:
      <programlisting>
ERROR: shutdown of the primary server could not be confirmed
HINT: check the primary server status before performing any further actions</programlisting>
386     </para>
387     <para>
388       This means the shutdown of the old primary has taken longer than &repmgr; expected,
389       and it has given up waiting.
390     </para>
391     <para>
392       In this case, check the PostgreSQL log on the primary server to see what is going
393       on. It's entirely possible the shutdown process is just taking longer than the
394       timeout set by the configuration parameter <varname>shutdown_check_timeout</varname>
395       (default: 60 seconds), in which case you may need to adjust this parameter.
396     </para>
397     <note>
398       <para>
399         Note that <varname>shutdown_check_timeout</varname> is set on the node where
400         <command>repmgr standby switchover</command> is executed (promotion candidate); setting it on the
401         demotion candidate (former primary) will have no effect.
402       </para>
403     </note>
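    <para>
      For example, to allow two minutes for the shutdown, the following could be set in
      <filename>repmgr.conf</filename> on the promotion candidate. The value
      <literal>120</literal> is an arbitrary illustration; choose one based on how long
      a shutdown of your primary actually takes.
    </para>
    <programlisting>
    # repmgr.conf on the promotion candidate; 120 is an arbitrary example value
    shutdown_check_timeout = 120</programlisting>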
404     <para>
405       If the primary server has shut down cleanly, and no other node has been promoted,
406       it is safe to restart it, in which case the replication cluster will be restored
407       to its original configuration.
408     </para>
409   </sect2>
410
411   <sect2 id="switchover-troubleshooting-exclusive-backup">
412     <title>Switchover aborts with an &quot;exclusive backup&quot; error</title>
413     <para>
414       &repmgr; may abort a switchover with a message like:
      <programlisting>
ERROR: unable to perform a switchover while primary server is in exclusive backup mode
HINT: stop backup before attempting the switchover</programlisting>
418     </para>
419     <para>
420       This means an exclusive backup is running on the current primary; interrupting this
421       will not only abort the backup, but potentially leave the primary with an ambiguous
422       backup state.
423     </para>
424     <para>
425       To proceed, either wait until the backup has finished, or cancel it with the command
426       <command>SELECT pg_stop_backup()</command>. For more details see the PostgreSQL
427       documentation section
428       <ulink url="https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-EXCLUSIVE">Making an exclusive low level backup</ulink>.
429     </para>
430   </sect2>
431 </sect1>
432
433</chapter>
434