
<chapter id="repmgrd-configuration">

  <title>repmgrd setup and configuration</title>

  <indexterm>
    <primary>repmgrd</primary>
    <secondary>configuration</secondary>
  </indexterm>

  <para>
    &repmgrd; is a daemon process which runs on each PostgreSQL node,
    monitoring the local node and (unless it's the primary node) the upstream server
    (the primary server or, with cascading replication, another standby) to which it's
    connected.
  </para>
  <para>
    &repmgrd; can be configured to provide failover
    capability in case the primary or upstream node becomes unreachable, and/or
    provide monitoring data to the &repmgr; metadatabase.
  </para>
  <para>
    From &repmgr; 4.4, when running on the primary node, &repmgrd; can also monitor
    standby disconnections/reconnections (see <xref linkend="repmgrd-primary-child-disconnection"/>).
  </para>

  <sect1 id="repmgrd-basic-configuration">
    <title>repmgrd configuration</title>

    <para>
      To use &repmgrd;, its associated function library <emphasis>must</emphasis> be
      included via <filename>postgresql.conf</filename> with:

      <programlisting>
        shared_preload_libraries = 'repmgr'</programlisting>
    </para>
    <para>
      Changing this setting requires a restart of PostgreSQL; for more details see
      the <ulink url="https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">PostgreSQL documentation</ulink>.
    </para>

    <para>
      The following configuration options apply to &repmgrd; in all circumstances:
    </para>
    <variablelist>

      <varlistentry>
        <term><option>monitor_interval_secs</option></term>
        <listitem>
          <indexterm>
            <primary>monitor_interval_secs</primary>
          </indexterm>

          <para>
            The interval (in seconds, default: <literal>2</literal>) to check the availability of the upstream node.
          </para>
        </listitem>

      </varlistentry>

      <varlistentry id="connection-check-type">

        <term><option>connection_check_type</option></term>
        <listitem>
          <indexterm>
            <primary>connection_check_type</primary>
          </indexterm>

          <para>
            The option <option>connection_check_type</option> is used to select the method
            &repmgrd; uses to determine whether the upstream node is available.
          </para>
          <para>
            Possible values are:
            <itemizedlist spacing="compact" mark="bullet">
              <listitem>
                  <simpara>
                    <literal>ping</literal> (default) - uses <command>PQping()</command> to
                    determine server availability
                  </simpara>
              </listitem>
              <listitem>
                <simpara>
                  <literal>connection</literal> - determines server availability
                  by attempting to make a new connection to the upstream node
                </simpara>
              </listitem>
              <listitem>
                <simpara>
                  <literal>query</literal> - determines server availability
                  by executing an SQL statement on the node via the existing connection
                </simpara>
              </listitem>

            </itemizedlist>
          </para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><option>reconnect_attempts</option></term>
        <listitem>
          <indexterm>
            <primary>reconnect_attempts</primary>
          </indexterm>
          <para>
            The number of attempts (default: <literal>6</literal>) which will be made to reconnect to an unreachable
            upstream node before initiating a failover.
          </para>
          <para>
            There will be an interval of <option>reconnect_interval</option> seconds between each reconnection
            attempt.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><option>reconnect_interval</option></term>

        <listitem>
          <indexterm>
            <primary>reconnect_interval</primary>
          </indexterm>

          <para>
            Interval (in seconds, default: <literal>10</literal>) between attempts to reconnect to an unreachable
            upstream node.
          </para>
          <para>
              The number of reconnection attempts is defined by the parameter <option>reconnect_attempts</option>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><option>degraded_monitoring_timeout</option></term>
        <listitem>
          <indexterm>
            <primary>degraded_monitoring_timeout</primary>
          </indexterm>

          <para>
            Interval (in seconds) after which &repmgrd; will terminate if
            either of the servers (local node and/or upstream node) being monitored is no longer available
            (<link linkend="repmgrd-degraded-monitoring">degraded monitoring mode</link>).
          </para>
          <para>
            <literal>-1</literal> (default) disables this timeout completely.
          </para>
	    </listitem>
	  </varlistentry>

    </variablelist>
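
      <para>
        As an illustration, the options above could be set in <filename>repmgr.conf</filename>
        as in the following sketch, which simply restates the documented defaults; adjust the
        values to suit your environment. With these values, an unreachable upstream node would
        be retried 6 times at 10-second intervals, i.e. for roughly one minute (not counting the
        time taken by each connection attempt itself), before a failover is initiated:
        <programlisting>
          # check the availability of the upstream node every 2 seconds
          monitor_interval_secs=2
          # use PQping() to determine server availability
          connection_check_type=ping
          # 6 attempts x 10 seconds: roughly 60 seconds before failover
          reconnect_attempts=6
          reconnect_interval=10
          # never terminate while in degraded monitoring mode
          degraded_monitoring_timeout=-1</programlisting>
      </para>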

      <para>
        See also <filename><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink></filename> for an annotated sample configuration file.
      </para>

    <sect2 id="repmgrd-automatic-failover-configuration">
      <title>Required configuration for automatic failover</title>

      <para>
        The following &repmgrd; options <emphasis>must</emphasis> be set in
        <filename>repmgr.conf</filename>:

        <itemizedlist spacing="compact" mark="bullet">
          <listitem>
            <simpara><option>failover</option></simpara>
          </listitem>
          <listitem>
            <simpara><option>promote_command</option></simpara>
          </listitem>
          <listitem>
            <simpara><option>follow_command</option></simpara>
          </listitem>
        </itemizedlist>
      </para>


      <para>
        Example:
        <programlisting>
          failover=automatic
          promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
          follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
      </para>
      <para>
        Details of each option are as follows:
      </para>
      <variablelist>
        <varlistentry>

          <term><option>failover</option></term>
          <listitem>
            <indexterm>
              <primary>failover</primary>
            </indexterm>

            <para>
              <option>failover</option> can be one of <literal>automatic</literal> or <literal>manual</literal>.
            </para>
            <note>
              <para>
                If <option>failover</option> is set to <literal>manual</literal>, &repmgrd;
                will not take any action if a failover situation is detected, and the node may need to
                be modified manually (e.g. by executing <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>).
              </para>
            </note>

          </listitem>
        </varlistentry>

        <varlistentry>
          <term><option>promote_command</option></term>

          <listitem>
            <indexterm>
              <primary>promote_command</primary>
            </indexterm>

            <para>
              The program or script defined in <option>promote_command</option> will be executed
              in a failover situation when &repmgrd; determines that
              the current node is to become the new primary node.
            </para>
            <para>
              Normally <option>promote_command</option> is set as &repmgr;'s
              <command><link linkend="repmgr-standby-promote">repmgr standby promote</link></command> command.
            </para>

            <note>
              <para>
                When invoking <command>repmgr standby promote</command> (either directly via
                the <option>promote_command</option>, or in a script called
                via <option>promote_command</option>), <option>--siblings-follow</option>
                <emphasis>must not</emphasis> be included as a
                command line option for <command>repmgr standby promote</command>.
              </para>
            </note>

            <para>
              It is also possible to provide a shell script to e.g. perform user-defined tasks
              before promoting the current node. In this case the script <emphasis>must</emphasis>
              at some point execute <command><link linkend="repmgr-standby-promote">repmgr standby promote</link></command>
              to promote the node; if this is not done, &repmgr; metadata will not be updated and
              &repmgr; will no longer function reliably.
            </para>
            <para>
              Example:
              <programlisting>
                promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'</programlisting>
            </para>

            <para>
              Note that the <literal>--log-to-file</literal> option will cause
              output generated by the &repmgr; command, when executed by &repmgrd;,
              to be logged to the same destination configured to receive log output for &repmgrd;.
            </para>
            <note>
              <para>
                &repmgr; will not apply <option>pg_bindir</option> when executing <option>promote_command</option>
                or <option>follow_command</option>; these can be user-defined scripts so must always be
                specified with the full path.
              </para>
            </note>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><option>follow_command</option></term>
          <listitem>
            <indexterm>
              <primary>follow_command</primary>
            </indexterm>

            <para>
              The program or script defined in <option>follow_command</option> will be executed
              in a failover situation when &repmgrd; determines that
              the current node is to follow the new primary node.
            </para>
            <para>
              Normally <option>follow_command</option> is set as &repmgr;'s
              <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command> command.
            </para>
            <para>
              The <option>follow_command</option> parameter
              should provide the <literal>--upstream-node-id=%n</literal>
              option to <command>repmgr standby follow</command>; the <literal>%n</literal> will be replaced by
              &repmgrd; with the ID of the new primary node. If this is not provided,
              <command>repmgr standby follow</command> will attempt to determine the new primary by itself, but if the
              original primary comes back online after the new primary is promoted, there is a risk that
              <command>repmgr standby follow</command> will result in the node continuing to follow
              the original primary.
            </para>
            <para>
              It is also possible to provide a shell script to e.g. perform user-defined tasks
              before following the new primary node. In this case the script <emphasis>must</emphasis>
              at some point execute <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>
              to have the node follow the new primary; if this is not done, &repmgr; metadata will not be updated and
              &repmgr; will no longer function reliably.
            </para>
            <para>
              Example:
              <programlisting>
          follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
            </para>

            <para>
              Note that the <literal>--log-to-file</literal> option will cause
              output generated by the &repmgr; command, when executed by &repmgrd;,
              to be logged to the same destination configured to receive log output for &repmgrd;.
            </para>
            <note>
              <para>
                &repmgr; will not apply <option>pg_bindir</option> when executing <option>promote_command</option>
                or <option>follow_command</option>; these can be user-defined scripts so must always be
                specified with the full path.
              </para>
            </note>
          </listitem>

        </varlistentry>

      </variablelist>


    </sect2>

    <sect2 id="repmgrd-automatic-failover-configuration-optional" xreflabel="Optional configuration for automatic failover">
      <title>Optional configuration for automatic failover</title>

      <para>
        The following configuration options can be used to fine-tune automatic failover:
      </para>
      <variablelist>

        <varlistentry>
          <term><option>priority</option></term>
          <listitem>
            <indexterm>
              <primary>priority</primary>
            </indexterm>

            <para>
              Indicates a preferred priority (default: <literal>100</literal>) for promoting nodes;
			  a value of zero prevents the node being promoted to primary.
            </para>
            <para>
              Note that the priority setting is only applied if two or more nodes are
              determined as promotion candidates; in that case the node with the
              higher priority is selected.
            </para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><option>failover_validation_command</option></term>
          <listitem>
            <indexterm>
              <primary>failover_validation_command</primary>
            </indexterm>

            <para>
              User-defined script to execute for an external mechanism to validate the failover
	      decision made by &repmgrd;.
            </para>
            <note>
              <para>
                This option <emphasis>must</emphasis> be identically configured
                on all nodes.
              </para>
            </note>
            <para>
              One or more of the following parameter placeholders
              may be provided, which will be replaced by &repmgrd; with the appropriate
              value:
              <itemizedlist spacing="compact" mark="bullet">
                <listitem>
                  <simpara><literal>%n</literal>: node ID</simpara>
                </listitem>
                <listitem>
                  <simpara><literal>%a</literal>: node name</simpara>
                </listitem>
                <listitem>
                  <simpara><literal>%v</literal>: number of visible nodes</simpara>
                </listitem>
                <listitem>
                  <simpara><literal>%u</literal>: number of shared upstream nodes</simpara>
                </listitem>
                <listitem>
                  <simpara><literal>%t</literal>: total number of nodes</simpara>
                </listitem>
              </itemizedlist>
            </para>
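            <para>
              For example, a setting which passes the node ID, the number of visible nodes and the
              total number of nodes to a script might look like the following sketch. The script path
              <filename>/usr/local/bin/failover-validation.sh</filename> is hypothetical and
              stands in for a script you provide yourself:
              <programlisting>
                failover_validation_command='/usr/local/bin/failover-validation.sh %n %v %t'</programlisting>
            </para>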
            <para>
              See also: <link linkend="repmgrd-failover-validation">Failover validation</link>.
            </para>
          </listitem>
        </varlistentry>


        <varlistentry>
          <term><option>primary_visibility_consensus</option></term>

          <listitem>
            <indexterm>
              <primary>primary_visibility_consensus</primary>
            </indexterm>

            <para>
              If <literal>true</literal>, only continue with failover if no standbys
              (or the witness server, if present) have seen the primary node recently.
            </para>
            <note>
              <para>
                This option <emphasis>must</emphasis> be identically configured
                on all nodes.
              </para>
            </note>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><option>always_promote</option></term>

          <listitem>
            <indexterm>
              <primary>always_promote</primary>
            </indexterm>

            <para>
              Default: <literal>false</literal>.
            </para>
            <para>
              If <literal>true</literal>, promote the local node even if its
              &repmgr; metadata is not up-to-date.
            </para>
            <para>
              Normally &repmgr; expects its metadata (stored in the <varname>repmgr.nodes</varname>
              table) to be up-to-date so &repmgrd; can take the correct action during a failover.
              However it's possible that updates made on the primary may not
              have propagated to the standby (promotion candidate). In this case &repmgrd; will
              default to not promoting the standby. This behaviour can be overridden by setting
              <option>always_promote</option> to <literal>true</literal>.
            </para>
          </listitem>
        </varlistentry>


        <varlistentry>

          <term><option>standby_disconnect_on_failover</option></term>
          <listitem>
            <indexterm>
              <primary>standby_disconnect_on_failover</primary>
            </indexterm>

            <para>
              In a failover situation, disconnect the local node's WAL receiver.
            </para>
            <para>
              This option is available with PostgreSQL 9.5 and later.
            </para>
            <note>
              <para>
                This option <emphasis>must</emphasis> be identically configured
                on all nodes.
              </para>
              <para>
                Additionally the &repmgr; user <emphasis>must</emphasis> be a superuser
                for this option.
              </para>
              <para>
                &repmgrd; will refuse to start if this option is set
                but either of these prerequisites is not met.
              </para>
            </note>

            <para>
              See also: <link linkend="repmgrd-standby-disconnection-on-failover">Standby disconnection on failover</link>.
            </para>
          </listitem>
        </varlistentry>

      </variablelist>

      <para>
        The following options can be used to further fine-tune failover behaviour.
        In practice it's unlikely these will need to be changed from their default
        values, but they are available as configuration options should the need arise.
      </para>
      <variablelist>

        <varlistentry>
          <term><option>election_rerun_interval</option></term>
          <listitem>
            <indexterm>
              <primary>election_rerun_interval</primary>
            </indexterm>

            <para>
              If <option>failover_validation_command</option> is set, and the command returns
              an error, pause for the specified number of seconds (default: <literal>15</literal>) before rerunning the election.
            </para>
		  </listitem>
		</varlistentry>


        <varlistentry>
          <term><option>sibling_nodes_disconnect_timeout</option></term>
          <listitem>
            <indexterm>
              <primary>sibling_nodes_disconnect_timeout</primary>
            </indexterm>

			<para>
              If <option>standby_disconnect_on_failover</option> is <literal>true</literal>, the
              maximum length of time (in seconds, default: <literal>30</literal>)
			  to wait for other standbys to confirm they have disconnected their
		      WAL receivers.
			</para>
		  </listitem>
		</varlistentry>
      </variablelist>
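
      <para>
        The optional settings described in this section might be combined in
        <filename>repmgr.conf</filename> as in the following sketch. The values shown for
        <option>priority</option>, <option>always_promote</option>,
        <option>sibling_nodes_disconnect_timeout</option> and <option>election_rerun_interval</option>
        are the documented defaults; enabling <option>primary_visibility_consensus</option> and
        <option>standby_disconnect_on_failover</option> is an illustrative choice, not a recommendation:
        <programlisting>
          # a value of zero would prevent this node being promoted
          priority=100
          always_promote=false
          # the following two options must be identically configured on all nodes
          primary_visibility_consensus=true
          standby_disconnect_on_failover=true
          # only relevant if standby_disconnect_on_failover is true
          sibling_nodes_disconnect_timeout=30
          # only relevant if failover_validation_command is set
          election_rerun_interval=15</programlisting>
      </para>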


    </sect2>


    <sect2 id="repmgrd-automatic-failover-configuration-pgbouncer-fencing">
      <title>Configuring &repmgrd; and pgbouncer to fence a failed primary node</title>
      <indexterm>
        <primary>fencing</primary>
        <secondary>using repmgrd and pgbouncer to fence a failed primary node</secondary>
      </indexterm>
      <indexterm>
        <primary>PgBouncer</primary>
        <secondary>using repmgrd and pgbouncer to fence a failed primary node</secondary>
      </indexterm>
      <para>
        For further details and a reference implementation, see the separate document
        <ulink url="https://github.com/2ndQuadrant/repmgr/blob/master/doc/repmgrd-node-fencing.md">Fencing a failed master node with repmgrd and PgBouncer</ulink>.
      </para>
    </sect2>

    <sect2 id="postgresql-service-configuration">
      <title>PostgreSQL service configuration</title>

      <indexterm>
        <primary>repmgrd</primary>
        <secondary>PostgreSQL service configuration</secondary>
      </indexterm>
      <para>
        If using automatic failover, currently &repmgrd; will need to execute
        <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
        to restart PostgreSQL on standbys to have them follow a new primary.
      </para>
      <para>
        To ensure this happens smoothly, it's essential to provide the system/service restart
        command appropriate to your operating system via <varname>service_restart_command</varname>
        in <filename>repmgr.conf</filename>. If you don't do this, &repmgrd;
        will default to using <command>pg_ctl</command>, which can result in unexpected problems,
        particularly on <application>systemd</application>-based systems.
      </para>
      <para>
        For more details, see <xref linkend="configuration-file-service-commands"/>.
      </para>
    </sect2>

    <sect2 id="repmgrd-service-configuration">
      <title>repmgrd service configuration</title>

      <indexterm>
        <primary>repmgrd</primary>
        <secondary>repmgrd service configuration</secondary>
      </indexterm>
      <para>
        If you are intending to use the <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link>
        and <link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link>
        commands, the following
        parameters <emphasis>must</emphasis> be set in <filename>repmgr.conf</filename>:
        <itemizedlist spacing="compact" mark="bullet">

          <listitem>
            <simpara><varname>repmgrd_service_start_command</varname></simpara>
          </listitem>

          <listitem>
            <simpara><varname>repmgrd_service_stop_command</varname></simpara>
          </listitem>

        </itemizedlist>

      </para>
      <para>
        Example (for &repmgr; with PostgreSQL 12 on CentOS 7):
        <programlisting>
repmgrd_service_start_command='sudo systemctl start repmgr12'
repmgrd_service_stop_command='sudo systemctl stop repmgr12'
</programlisting>
      </para>
      <para>
        For more details see the reference page for each command.
      </para>
    </sect2>


    <sect2 id="repmgrd-monitoring-configuration" xreflabel="repmgrd monitoring configuration">
      <title>Monitoring configuration</title>

      <indexterm>
        <primary>repmgrd</primary>
        <secondary>monitoring configuration</secondary>
      </indexterm>
      <para>
        To enable monitoring, set:
        <programlisting>
          monitoring_history=yes</programlisting>
        in <filename>repmgr.conf</filename>.
      </para>
      <para>
        Monitoring data is written at the interval defined by
        the option <option>monitor_interval_secs</option> (see above).
      </para>
      <para>
        For more details on monitoring, see <xref linkend="repmgrd-monitoring"/>. For information on
        monitoring standby disconnections, see <xref linkend="repmgrd-primary-child-disconnection"/>.
      </para>
    </sect2>

    <sect2 id="repmgrd-reloading-configuration" xreflabel="reloading repmgrd configuration">
      <title>Applying configuration changes to repmgrd</title>

      <indexterm>
        <primary>repmgrd</primary>
        <secondary>applying configuration changes</secondary>
      </indexterm>
      <para>
        To apply configuration file changes to a running &repmgrd;
        daemon, execute the operating system's &repmgrd; service reload command
        (see <xref linkend="appendix-packages"/> for examples),
          or, for instances which were manually started, execute <command>kill -HUP</command>, e.g.
          <command>kill -HUP `cat /tmp/repmgrd.pid`</command>.
      </para>
      <tip>
        <para>
          Check the &repmgrd; log to see what changes were
          applied, or if any issues were encountered when reloading the configuration.
        </para>
      </tip>
      <para>
        Note that only the following subset of configuration file parameters can be changed on a
        running &repmgrd; daemon:
      </para>
      <itemizedlist spacing="compact" mark="bullet">

        <listitem>
          <simpara>
            <varname>async_query_timeout</varname>
          </simpara>
        </listitem>

         <listitem>
          <simpara>
            <varname>child_nodes_check_interval</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>child_nodes_connected_include_witness</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>child_nodes_connected_min_count</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>child_nodes_disconnect_command</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>child_nodes_disconnect_min_count</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>child_nodes_disconnect_timeout</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>connection_check_type</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>conninfo</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>degraded_monitoring_timeout</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>event_notification_command</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>event_notifications</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>failover_validation_command</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>failover</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>follow_command</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>log_facility</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>log_file</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>log_level</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>log_status_interval</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>monitor_interval_secs</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>monitoring_history</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>primary_notification_timeout</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>primary_visibility_consensus</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>always_promote</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>promote_command</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>reconnect_attempts</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>reconnect_interval</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>retry_promote_interval_secs</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>repmgrd_standby_startup_timeout</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>sibling_nodes_disconnect_timeout</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>standby_disconnect_on_failover</varname>
          </simpara>
        </listitem>

      </itemizedlist>

      <para>
        The following set of configuration file parameters must be updated via
        <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>,
        as they require changes to the <literal>repmgr.nodes</literal> table so they are visible to
        all nodes in the replication cluster:
      </para>
      <itemizedlist spacing="compact" mark="bullet">

        <listitem>
          <simpara>
            <varname>node_id</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>node_name</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>data_directory</varname>
          </simpara>
        </listitem>

        <listitem>
          <simpara>
            <varname>location</varname>
          </simpara>
        </listitem>


        <listitem>
          <simpara>
            <varname>priority</varname>
          </simpara>
        </listitem>

      </itemizedlist>

      <note>
        <para>
          After executing <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>,
          &repmgrd; <emphasis>must</emphasis> be restarted for the changes to take effect.
        </para>
      </note>

    </sect2>

  </sect1>

  <sect1 id="repmgrd-daemon" xreflabel="repmgrd daemon">
    <title>repmgrd daemon</title>

    <indexterm>
      <primary>repmgrd</primary>
      <secondary>starting and stopping</secondary>
    </indexterm>
    <para>
      If installed from a package, &repmgrd; can be started
      via the operating system's service command, e.g. in <application>systemd</application>
      using <command>systemctl</command>.
    </para>
    <para>
      See appendix <xref linkend="appendix-packages"/> for details of service commands
      for different distributions.
    </para>
    <para>
      The commands <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link> and
      <link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link> can be used
      as convenience wrappers to start and stop &repmgrd; on the local node.
    </para>
    <important>
      <para>
        <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link> and
        <link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link> require
        that the appropriate start/stop commands are configured as
        <varname>repmgrd_service_start_command</varname> and <varname>repmgrd_service_stop_command</varname>
        in <filename>repmgr.conf</filename>.
      </para>
    </important>
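    <para>
      As an illustration, on a <application>systemd</application>-based system the relevant
      <filename>repmgr.conf</filename> entries might look like the following sketch. The service
      name <literal>repmgrd</literal> and the use of <command>sudo</command> are assumptions;
      the actual service name depends on the distribution and package in use:
      <programlisting>
        repmgrd_service_start_command='sudo systemctl start repmgrd'
        repmgrd_service_stop_command='sudo systemctl stop repmgrd'</programlisting>
    </para>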
    <para>
      &repmgrd; can be started manually like this:
      <programlisting>
        repmgrd -f /etc/repmgr.conf --pid-file /tmp/repmgrd.pid</programlisting>
      and stopped with <command>kill `cat /tmp/repmgrd.pid`</command>. Adjust paths as appropriate.
    </para>

    <sect2 id="repmgrd-pid-file" xreflabel="repmgrd's PID file">
      <title>repmgrd's PID file</title>

      <indexterm>
        <primary>repmgrd</primary>
        <secondary>PID file</secondary>
      </indexterm>
      <indexterm>
        <primary>PID file</primary>
        <secondary>repmgrd</secondary>
      </indexterm>
      <para>
        &repmgrd; will generate a PID file by default.
      </para>
      <note>
        <simpara>
          This is a behaviour change from previous versions (earlier than 4.1), where
          the PID file had to be explicitly specified with the command line
          parameter <option>--pid-file</option>.
        </simpara>
      </note>
      <para>
        The PID file can be specified in <filename>repmgr.conf</filename> with the configuration
        parameter <varname>repmgrd_pid_file</varname>.
      </para>
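      <para>
        A minimal example of such a <filename>repmgr.conf</filename> entry is shown below. The path is
        purely illustrative; any location writable by the user running &repmgrd; can be used:
        <programlisting>
        repmgrd_pid_file='/var/run/repmgr/repmgrd.pid'</programlisting>
      </para>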
      <para>
        It can also be specified on the command line (as in previous versions) with
        the command line parameter <option>--pid-file</option>. Note this will override
        any value set in <filename>repmgr.conf</filename> with <varname>repmgrd_pid_file</varname>.
        <option>--pid-file</option> may be deprecated in future releases.
      </para>
      <para>
        If &repmgr; was installed from a package and the package maintainer has specified
        a PID file location, &repmgrd; will use that location.
      </para>
      <para>
        If none of the above apply, &repmgrd; will create a PID file
        in the operating system's temporary directory (as determined by the environment variable
        <varname>TMPDIR</varname>, or if that is not set, will use <filename>/tmp</filename>).
      </para>
      <para>
        To prevent a PID file being generated at all, provide the command line option
        <option>--no-pid-file</option>.
      </para>
      <para>
        To see which PID file &repmgrd; would use, execute &repmgrd;
        with the option <option>--show-pid-file</option>. &repmgrd;
        will not start if this option is provided. Note that the value shown is the
        file &repmgrd; would use next time it starts, and is
        not necessarily the PID file currently in use.
      </para>
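      <para>
        For example, assuming the configuration file is located at
        <filename>/etc/repmgr.conf</filename> (adjust the path as appropriate), the check
        could be run like this:
        <programlisting>
        repmgrd -f /etc/repmgr.conf --show-pid-file</programlisting>
      </para>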
    </sect2>

    <sect2 id="repmgrd-configuration-debian-ubuntu">
      <title>repmgrd daemon configuration on Debian/Ubuntu</title>

      <indexterm>
        <primary>repmgrd</primary>
        <secondary>Debian/Ubuntu and daemon configuration</secondary>
      </indexterm>
      <indexterm>
        <primary>Debian/Ubuntu</primary>
        <secondary>repmgrd daemon configuration</secondary>
      </indexterm>

      <para>
        If &repmgr; was installed from Debian/Ubuntu packages, additional configuration
        is required before &repmgrd; is started as a daemon.
      </para>
      <para>
        This is done via the file <filename>/etc/default/repmgrd</filename>, which by default
        looks like this:
        <programlisting>
# default settings for repmgrd. This file is source by /bin/sh from
# /etc/init.d/repmgrd

# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=no

# configuration file (required)
#REPMGRD_CONF="/path/to/repmgr.conf"

# additional options
REPMGRD_OPTS="--daemonize=false"

# user to run repmgrd as
#REPMGRD_USER=postgres

# repmgrd binary
#REPMGRD_BIN=/usr/bin/repmgrd

# pid file
#REPMGRD_PIDFILE=/var/run/repmgrd.pid</programlisting>
      </para>
      <para>
        Set <varname>REPMGRD_ENABLED</varname> to <literal>yes</literal>, and <varname>REPMGRD_CONF</varname>
        to the <filename>repmgr.conf</filename> file you are using.
      </para>
      <tip>
        <para>
          See <xref linkend="packages-debian-ubuntu"/> for details of the Debian/Ubuntu packages and
          typical file locations (including <filename>repmgr.conf</filename>).
        </para>
      </tip>
      <para>
        From &repmgrd; 4.1, ensure <varname>REPMGRD_OPTS</varname> includes
        <option>--daemonize=false</option>, as daemonization is handled by the service command.
      </para>
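      <para>
        As a sketch, the edited settings in <filename>/etc/default/repmgrd</filename> might then
        look like this. The <filename>repmgr.conf</filename> location shown is an assumption and
        must be replaced with the path actually in use:
        <programlisting>
REPMGRD_ENABLED=yes
REPMGRD_CONF="/etc/repmgr.conf"
REPMGRD_OPTS="--daemonize=false"</programlisting>
      </para>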
      <para>
        If using <application>systemd</application>, you may need to execute <command>systemctl daemon-reload</command>.
        Also, if you have already attempted to start &repmgrd; using <command>systemctl start repmgrd</command>,
        you'll need to execute <command>systemctl stop repmgrd</command> before starting it again.
      </para>

    </sect2>
  </sect1>

  <sect1 id="repmgrd-connection-settings">
    <title>repmgrd connection settings</title>
 <para>
  In addition to the &repmgr; configuration settings, parameters in the
  <varname>conninfo</varname> string influence how &repmgr; makes a network connection to
  PostgreSQL. In particular, if another server in the replication cluster
  is unreachable at network level, system network settings will influence
  the length of time it takes to determine that the connection is not possible.
 </para>
 <para>
  Explicitly setting a value for <literal>connect_timeout</literal>
  should be considered; the effective minimum value of <literal>2</literal>
  (seconds) will ensure that a connection failure at network level is reported
  as soon as possible. Otherwise, depending on the system settings (e.g.
  <varname>tcp_syn_retries</varname> in Linux), a delay of a minute or more
  is possible.
 </para>
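 <para>
  For illustration, a <varname>conninfo</varname> setting in <filename>repmgr.conf</filename>
  with an explicit timeout might look like the following. The host, user and database
  names are placeholders, not values required by &repmgr;:
  <programlisting>
    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>
 </para>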
 <para>
  For further details on <varname>conninfo</varname> network connection
  parameters, see the
  <ulink url="https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS">PostgreSQL documentation</ulink>.
 </para>
 </sect1>


  <sect1 id="repmgrd-log-rotation">
     <title>repmgrd log rotation</title>

   <indexterm>
     <primary>log rotation</primary>
     <secondary>repmgrd</secondary>
   </indexterm>

   <indexterm>
     <primary>repmgrd</primary>
     <secondary>log rotation</secondary>
   </indexterm>

  <para>
   To ensure the current &repmgrd; logfile
   (specified in <filename>repmgr.conf</filename> with the parameter
   <option>log_file</option>) does not grow indefinitely, configure your
   system's <command>logrotate</command> to regularly rotate it.
  </para>
  <para>
   Sample configuration to rotate logfiles weekly, with retention for
   up to 52 weeks and rotation forced if a file grows beyond 100MB:
   <programlisting>
    /var/log/repmgr/repmgrd.log {
        missingok
        compress
        rotate 52
        maxsize 100M
        weekly
        create 0600 postgres postgres
        postrotate
            /usr/bin/killall -HUP repmgrd
        endscript
    }</programlisting>
  </para>

 </sect1>
</chapter>