<refentry id="repmgr-cluster-matrix">
  <indexterm>
    <primary>repmgr cluster matrix</primary>
  </indexterm>

  <refmeta>
    <refentrytitle>repmgr cluster matrix</refentrytitle>
  </refmeta>

  <refnamediv>
    <refname>repmgr cluster matrix</refname>
    <refpurpose>
      runs repmgr cluster show on each node and summarizes output
    </refpurpose>
  </refnamediv>

  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr cluster matrix</command> runs <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command> on each
      node and arranges the results in a matrix, recording success or failure.
    </para>
    <para>
      <command>repmgr cluster matrix</command> requires a valid <filename>repmgr.conf</filename>
      file on each node. Additionally, passwordless <command>ssh</command> connections are required between
      all nodes.
    </para>
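    <para>
      The passwordless <command>ssh</command> requirement can be checked in advance. The
      following is a minimal sketch only, not part of <application>repmgr</application>:
      the function name <literal>check_passwordless_ssh</literal> is hypothetical, and it
      assumes it is run as the same system user that will run
      <command>repmgr cluster matrix</command>. The <literal>BatchMode=yes</literal> option
      makes <command>ssh</command> fail instead of prompting for a password.
    <programlisting>
    # Hypothetical helper: succeeds only if ssh can log in to every
    # listed host without prompting for a password.
    check_passwordless_ssh() {
        failed=0
        for host in "$@"; do
            if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
                echo "ok: $host"
            else
                echo "FAILED: $host"
                failed=1
            fi
        done
        return "$failed"
    }</programlisting>
    </para>
    <para>
      Running, for example, <command>check_passwordless_ssh node1 node2 node3</command> on each
      node in turn would cover every pair of nodes; the host names here are the ones used
      in the examples on this page.
    </para>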
  </refsect1>

  <refsect1>
    <title>Example</title>
    <para>
      Example 1 (all nodes up):
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster matrix

    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  *
     node2 |  2 |  * |  * |  *
     node3 |  3 |  * |  * |  *</programlisting>
    </para>
    <para>
      Example 2 (<literal>node1</literal> and <literal>node2</literal> up, <literal>node3</literal> down):
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster matrix

    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  x
     node2 |  2 |  * |  * |  x
     node3 |  3 |  ? |  ? |  ?</programlisting>
    </para>
    <para>
      Each row corresponds to one server and shows the result of
      testing an outbound connection from that server to each node.
    </para>
    <para>
      Since <literal>node3</literal> is down, all the entries in its row are filled with
      <literal>?</literal>, meaning that outbound connections from it cannot be tested.
    </para>
    <para>
      The other two nodes are up; their rows have <literal>x</literal> in the
      column corresponding to <literal>node3</literal>, meaning that inbound connections to
      that node have failed, and <literal>*</literal> in the columns corresponding to
      <literal>node1</literal> and <literal>node2</literal>, meaning that inbound connections
      to these nodes have succeeded.
    </para>
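    <para>
      The cell symbols lend themselves to simple scripting. As a minimal sketch,
      the shell function below reads the matrix on standard input and prints one line for
      each cell marked <literal>x</literal>. The name <literal>matrix_failures</literal> is
      hypothetical (it is not part of <application>repmgr</application>), and the parsing
      assumes the output layout shown in the examples on this page.
    <programlisting>
    # Hypothetical helper: read "repmgr cluster matrix" output on stdin and
    # print one line per failed connection (cells marked "x").
    matrix_failures() {
        awk -F'|' '
            NF > 2 {
                # Trim the padding around every cell.
                for (i = 1; NF >= i; i++) gsub(/^ +| +$/, "", $i)
                if ($1 == "Name") {
                    # Header row: remember which node ID each column refers to.
                    for (i = 3; NF >= i; i++) target[i] = $i
                } else {
                    # Data row: report every cell marked as failed.
                    for (i = 3; NF >= i; i++)
                        if ($i == "x") print $1 " -> node ID " target[i]
                }
            }
        '
    }</programlisting>
    </para>
    <para>
      It could be used as <command>repmgr -f /etc/repmgr.conf cluster matrix | matrix_failures</command>.
      Applied to the output of Example 2, it would print <literal>node1 -> node ID 3</literal>
      and <literal>node2 -> node ID 3</literal>.
    </para>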
    <para>
      Example 3 (all nodes up, firewall dropping packets originating
      from <literal>node1</literal> and directed to port 5432 on <literal>node3</literal>).
      Running <command>repmgr cluster matrix</command> from <literal>node1</literal> gives the following output:
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster matrix

    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  x
     node2 |  2 |  * |  * |  *
     node3 |  3 |  ? |  ? |  ?</programlisting>
    </para>
    <para>
      Note that this may take some time, depending on the <varname>connect_timeout</varname>
      setting in the node <varname>conninfo</varname> strings. The default is
      <literal>1 minute</literal>, which means that without modification the above
      command would take around 2 minutes to run (see the comment elsewhere about setting
      <varname>connect_timeout</varname>).
    </para>
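    <para>
      As an illustration only, a shorter timeout could be set by adding
      <varname>connect_timeout</varname> to the <varname>conninfo</varname> string in
      <filename>repmgr.conf</filename>. The host, user and database names and the value of
      2 seconds below are assumptions chosen for the example, not recommendations:
    <programlisting>
    # repmgr.conf on node1 (illustrative values)
    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>
    </para>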
    <para>
      The matrix tells us that we cannot connect from <literal>node1</literal> to <literal>node3</literal>,
      and that we therefore don't know the state of any outbound
      connection from <literal>node3</literal>.
    </para>
    <para>
      In this case, the <xref linkend="repmgr-cluster-crosscheck"/> command will produce a more
      useful result.
    </para>
  </refsect1>


  <refsect1>
    <title>Exit codes</title>
    <para>
      One of the following exit codes will be emitted by <command>repmgr cluster matrix</command>:
    </para>
    <variablelist>

      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            The check completed successfully and all nodes are reachable.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><option>ERR_BAD_SSH (12)</option></term>
        <listitem>
          <para>
            One or more nodes could not be accessed via SSH.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><option>ERR_NODE_STATUS (25)</option></term>
        <listitem>
          <para>
            PostgreSQL on one or more nodes could not be reached.
          </para>
          <note>
            <simpara>
              This error code overrides <option>ERR_BAD_SSH</option>.
            </simpara>
          </note>
        </listitem>
      </varlistentry>

    </variablelist>
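    <para>
      A calling script can branch on these exit codes. The following is a minimal sketch:
      the function name <literal>describe_matrix_exit</literal> and the message wording are
      hypothetical, and only the three codes listed above are taken from this page.
    <programlisting>
    # Hypothetical helper: translate the exit code of
    # "repmgr cluster matrix" into a one-line summary.
    describe_matrix_exit() {
        case "$1" in
            0)  echo "SUCCESS: all nodes reachable" ;;
            12) echo "ERR_BAD_SSH: one or more nodes not accessible via SSH" ;;
            25) echo "ERR_NODE_STATUS: PostgreSQL unreachable on one or more nodes" ;;
            *)  echo "unexpected exit code: $1" ;;
        esac
    }</programlisting>
    </para>
    <para>
      For example: <command>repmgr -f /etc/repmgr.conf cluster matrix; describe_matrix_exit $?</command>
    </para>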
  </refsect1>

</refentry>