---
stage: Enablement
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
type: howto
---

# Geo database replication **(PREMIUM SELF)**

This document describes the minimal required steps to replicate your primary
GitLab database to a secondary node's database. You may have to change some
values, based on attributes including your database's setup and size.

NOTE:
If your GitLab installation uses external (not managed by Omnibus GitLab)
PostgreSQL instances, the Omnibus roles cannot perform all necessary
configuration steps. In this case, use the [Geo with external PostgreSQL instances](external_database.md)
process instead.

The stages of the setup process must be completed in the documented order.
Before you attempt the steps in this stage, [complete all prior stages](../setup/index.md#using-omnibus-gitlab).

Be sure to read and review all of these steps before you execute them in your
testing or production environments.

## Single instance database replication

A single instance database replication is easier to set up and still provides the same Geo capabilities
as a clustered alternative. It's useful for setups running on a single machine
or for evaluating Geo for a future clustered installation.

A single instance can be expanded to a clustered version using Patroni, which is recommended for a
highly available architecture.

Follow the instructions below to set up PostgreSQL replication as a single instance database.
Alternatively, you can look at the [Multi-node database replication](#multi-node-database-replication)
instructions on setting up replication with a Patroni cluster.

### PostgreSQL replication

The GitLab **primary** node where the write operations happen connects to
the **primary** database server, and **secondary** nodes
connect to their own database servers (which are read-only).

We recommend using [PostgreSQL replication slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
to ensure that the **primary** node retains all the data necessary for the **secondary** nodes to
recover. See below for more details.

The following guide assumes that:

- You are using Omnibus and therefore you are using PostgreSQL 12 or later,
  which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/12/app-pgbasebackup.html).
- You have a **primary** node already set up (the GitLab server you are
  replicating from), running Omnibus' PostgreSQL (or equivalent version), and
  you have a new **secondary** server set up with the same versions of the OS,
  PostgreSQL, and GitLab on all nodes.

WARNING:
Geo works with streaming replication. Logical replication is not supported at this time.
There is an [issue where support is being discussed](https://gitlab.com/gitlab-org/gitlab/-/issues/7420).

#### Step 1. Configure the **primary** server

1. SSH into your GitLab **primary** server and log in as root:

   ```shell
   sudo -i
   ```

1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** name for your site:

   ```ruby
   ##
   ## The unique identifier for the Geo site. See
   ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
   ##
   gitlab_rails['geo_node_name'] = '<site_name_here>'
   ```

1. Reconfigure the **primary** node for the change to take effect:

   ```shell
   gitlab-ctl reconfigure
   ```

1. Execute the command below to define the node as the **primary** node:

   ```shell
   gitlab-ctl set-geo-primary-node
   ```

   This command uses your defined `external_url` in `/etc/gitlab/gitlab.rb`.

1. Define a password for the `gitlab` database user:

   Generate an MD5 hash of the desired password:

   ```shell
   gitlab-ctl pg-password-md5 gitlab
   # Enter password: <your_password_here>
   # Confirm password: <your_password_here>
   # fca0b89a972d69f00eb3ec98a5838484
   ```
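
   If you want to double-check a hash, `pg-password-md5` follows the standard
   PostgreSQL `md5` password scheme: the MD5 digest of the password concatenated
   with the username. This is a local sketch only, using a hypothetical password
   and assuming `md5sum` is available:

   ```shell
   # Hypothetical password; the hash is md5(password + username).
   password='examplepassword'
   username='gitlab'
   hash=$(printf '%s%s' "$password" "$username" | md5sum | awk '{print $1}')
   echo "$hash"  # a 32-character hexadecimal string
   ```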

   Edit `/etc/gitlab/gitlab.rb`:

   ```ruby
   # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab`
   postgresql['sql_user_password'] = '<md5_hash_of_your_password>'

   # Every node that runs Puma or Sidekiq needs to have the database
   # password specified as below. If you have a high-availability setup, this
   # must be present in all application nodes.
   gitlab_rails['db_password'] = '<your_password_here>'
   ```

1. Define a password for the database [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication).

   We will use the username defined in `/etc/gitlab/gitlab.rb` under the `postgresql['sql_replication_user']`
   setting. The default value is `gitlab_replicator`, but if you changed it to something else, adapt
   the instructions below.

   Generate an MD5 hash of the desired password:

   ```shell
   gitlab-ctl pg-password-md5 gitlab_replicator
   # Enter password: <your_password_here>
   # Confirm password: <your_password_here>
   # 950233c0dfc2f39c64cf30457c3b7f1e
   ```

   Edit `/etc/gitlab/gitlab.rb`:

   ```ruby
   # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab_replicator`
   postgresql['sql_replication_password'] = '<md5_hash_of_your_password>'
   ```

   If you are using an external database not managed by Omnibus GitLab, you need
   to create the replicator user and define a password for it manually:

   ```sql
   -- Create a new user 'replicator'
   CREATE USER gitlab_replicator;

   -- Set/change the password and grant the replication privilege
   ALTER USER gitlab_replicator WITH REPLICATION ENCRYPTED PASSWORD '<replication_password>';
   ```

1. Configure PostgreSQL to listen on network interfaces:

   For security reasons, PostgreSQL does not listen on any network interfaces
   by default. However, Geo requires the **secondary** node to be able to
   connect to the **primary** node's database. For this reason, we need the address of
   each node.

   NOTE:
   For external PostgreSQL instances, see [additional instructions](external_database.md).

   If you are using a cloud provider, you can look up the addresses for each
   Geo node through your cloud provider's management console.

   To look up the address of a Geo node, SSH into the node and execute:

   ```shell
   ##
   ## Private address
   ##
   ip route get 255.255.255.255 | awk '{print "Private address:", $NF; exit}'

   ##
   ## Public address
   ##
   echo "External address: $(curl --silent "ipinfo.io/ip")"
   ```

   In most cases, the following addresses are used to configure GitLab
   Geo:

   | Configuration                           | Address                                               |
   |:----------------------------------------|:------------------------------------------------------|
   | `postgresql['listen_address']`          | **Primary** node's public or VPC private address.     |
   | `postgresql['md5_auth_cidr_addresses']` | **Secondary** node's public or VPC private addresses. |

   If you are using Google Cloud Platform, SoftLayer, or any other vendor that
   provides a virtual private cloud (VPC), you can use the **primary** and **secondary** nodes'
   private addresses (this corresponds to "internal address" for Google Cloud Platform) for
   `postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`.

   The `listen_address` option opens PostgreSQL up to network connections on the interface
   corresponding to the given address. See [the PostgreSQL documentation](https://www.postgresql.org/docs/12/runtime-config-connection.html)
   for more details.

   NOTE:
   If you need to use `0.0.0.0` or `*` as the `listen_address`, you also need to add
   `127.0.0.1/32` to the `postgresql['md5_auth_cidr_addresses']` setting, to allow Rails to connect through
   `127.0.0.1`. For more information, see [omnibus-5258](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5258).

   Depending on your network configuration, the suggested addresses may not
   be correct. If your **primary** node and **secondary** nodes connect over a local
   area network, or a virtual network connecting availability zones like
   [Amazon's VPC](https://aws.amazon.com/vpc/) or [Google's VPC](https://cloud.google.com/vpc/),
   you should use the **secondary** node's private address for `postgresql['md5_auth_cidr_addresses']`.

   Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
   addresses with addresses appropriate to your network configuration:

   ```ruby
   ##
   ## Geo Primary role
   ## - Configures Postgres settings for replication
   ## - Prevents automatic upgrade of Postgres since it requires downtime of
   ##   streaming replication to Geo secondary sites
   ## - Enables standard single-node GitLab services like NGINX, Puma, Redis,
   ##   or Sidekiq. If you are segregating services, then you will need to
   ##   explicitly disable unwanted services.
   ##
   roles(['geo_primary_role'])

   ##
   ## Primary address
   ## - replace '<primary_node_ip>' with the public or VPC address of your Geo primary node
   ##
   postgresql['listen_address'] = '<primary_node_ip>'

   ##
   ## Allow PostgreSQL client authentication from the primary and secondary IPs. These IPs may be
   ## public or VPC addresses in CIDR format, for example ['198.51.100.1/32', '198.51.100.2/32']
   ##
   postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32']

   ##
   ## Replication settings
   ## - set this to be the number of Geo secondary nodes you have
   ##
   postgresql['max_replication_slots'] = 1
   # postgresql['max_wal_senders'] = 10
   # postgresql['wal_keep_segments'] = 10

   ##
   ## Disable automatic database migrations temporarily
   ## (until PostgreSQL is restarted and listening on the private address).
   ##
   gitlab_rails['auto_migrate'] = false
   ```

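
   As a quick sanity check, you can assemble the `md5_auth_cidr_addresses` value
   from a list of node addresses in the shell. The IP addresses below are
   hypothetical examples:

   ```shell
   # Append /32 to each node address to form single-host CIDR entries.
   ips="10.1.0.1 10.2.0.1"
   cidrs=$(for ip in $ips; do printf "'%s/32', " "$ip"; done)
   echo "postgresql['md5_auth_cidr_addresses'] = [${cidrs%, }]"
   # => postgresql['md5_auth_cidr_addresses'] = ['10.1.0.1/32', '10.2.0.1/32']
   ```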
1. Optional: If you want to add another **secondary** node, the relevant setting would look like:

   ```ruby
   postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32', '<another_secondary_node_ip>/32']
   ```

   You may also want to adjust `wal_keep_segments` and `max_wal_senders` to match your
   database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/12/runtime-config-replication.html)
   for more information.

1. Save the file and reconfigure GitLab for the database listen changes and
   the replication slot changes to be applied:

   ```shell
   gitlab-ctl reconfigure
   ```

   Restart PostgreSQL for its changes to take effect:

   ```shell
   gitlab-ctl restart postgresql
   ```

1. Re-enable migrations now that PostgreSQL is restarted and listening on the
   private address.

   Edit `/etc/gitlab/gitlab.rb` and **change** the configuration to `true`:

   ```ruby
   gitlab_rails['auto_migrate'] = true
   ```

   Save the file and reconfigure GitLab:

   ```shell
   gitlab-ctl reconfigure
   ```

1. Now that the PostgreSQL server is set up to accept remote connections, run
   `netstat -plnt | grep 5432` to make sure that PostgreSQL is listening on port
   `5432` on the **primary** server's private address.

1. A certificate was automatically generated when GitLab was reconfigured. This
   is used automatically to protect your PostgreSQL traffic from
   eavesdroppers, but to protect against active ("man-in-the-middle") attackers,
   the **secondary** node needs a copy of the certificate. Make a copy of the PostgreSQL
   `server.crt` file on the **primary** node by running this command:

   ```shell
   cat ~gitlab-psql/data/server.crt
   ```

   Copy the output to your clipboard or to a local file. You
   need it when setting up the **secondary** node. The certificate is not sensitive
   data.

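
   To compare the certificate on both nodes later, you can print its subject and
   fingerprint. The helper below is a sketch that assumes `openssl` is available;
   the fingerprints reported on the **primary** and **secondary** nodes must match:

   ```shell
   # Print a certificate's subject and SHA-256 fingerprint, for example:
   # cert_fingerprint server.crt
   cert_fingerprint() {
     openssl x509 -in "$1" -noout -subject -fingerprint -sha256
   }
   ```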
#### Step 2. Configure the **secondary** server

1. SSH into your GitLab **secondary** server and log in as root:

   ```shell
   sudo -i
   ```

1. Stop the application server and Sidekiq:

   ```shell
   gitlab-ctl stop puma
   gitlab-ctl stop sidekiq
   ```

   NOTE:
   This step is important so we don't try to execute anything before the node is fully configured.

1. [Check TCP connectivity](../../raketasks/maintenance.md) to the **primary** node's PostgreSQL server:

   ```shell
   gitlab-rake gitlab:tcp_check[<primary_node_ip>,5432]
   ```

   NOTE:
   If this step fails, you may be using the wrong IP address, or a firewall may
   be preventing access to the server. Check the IP address, paying close
   attention to the difference between public and private addresses and ensure
   that, if a firewall is present, the **secondary** node is permitted to connect to the
   **primary** node on port 5432.

1. Create a file `server.crt` on the **secondary** server, with the content you got in the final step of the **primary** node's setup:

   ```shell
   editor server.crt
   ```

1. Set up PostgreSQL TLS verification on the **secondary** node:

   Install the `server.crt` file:

   ```shell
   install \
      -D \
      -o gitlab-psql \
      -g gitlab-psql \
      -m 0400 \
      -T server.crt ~gitlab-psql/.postgresql/root.crt
   ```

   PostgreSQL now only recognizes that exact certificate when verifying TLS
   connections. The certificate can only be replicated by someone with access
   to the private key, which is **only** present on the **primary** node.

1. Test that the `gitlab-psql` user can connect to the **primary** node's database
   (the default Omnibus database name is `gitlabhq_production`):

   ```shell
   sudo \
      -u gitlab-psql /opt/gitlab/embedded/bin/psql \
      --list \
      -U gitlab_replicator \
      -d "dbname=gitlabhq_production sslmode=verify-ca" \
      -W \
      -h <primary_node_ip>
   ```

   When prompted, enter the password you set in the first step for the
   `gitlab_replicator` user. If all worked correctly, you should see
   the list of the **primary** node's databases.

   A failure to connect here indicates that the TLS configuration is incorrect.
   Ensure that the contents of `~gitlab-psql/data/server.crt` on the **primary** node
   match the contents of `~gitlab-psql/.postgresql/root.crt` on the **secondary** node.

1. Configure PostgreSQL:

   This step is similar to how we configured the **primary** instance.
   You need to enable this, even when using a single node.

   Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
   addresses with addresses appropriate to your network configuration:

   ```ruby
   ##
   ## Geo Secondary role
   ## - configure dependent flags automatically to enable Geo
   ##
   roles(['geo_secondary_role'])

   ##
   ## Secondary address
   ## - replace '<secondary_node_ip>' with the public or VPC address of your Geo secondary node
   ##
   postgresql['listen_address'] = '<secondary_node_ip>'
   postgresql['md5_auth_cidr_addresses'] = ['<secondary_node_ip>/32']

   ##
   ## Database credentials password (defined previously in primary node)
   ## - replicate same values here as defined in primary node
   ##
   postgresql['sql_replication_password'] = '<md5_hash_of_your_password>'
   postgresql['sql_user_password'] = '<md5_hash_of_your_password>'
   gitlab_rails['db_password'] = '<your_password_here>'
   ```

   For external PostgreSQL instances, see [additional instructions](external_database.md).
   If you bring a former **primary** node back online to serve as a **secondary** node, then you also need to remove `roles(['geo_primary_role'])` or `geo_primary_role['enable'] = true`.

1. Reconfigure GitLab for the changes to take effect:

   ```shell
   gitlab-ctl reconfigure
   ```

1. Restart PostgreSQL for the IP change to take effect:

   ```shell
   gitlab-ctl restart postgresql
   ```

#### Step 3. Initiate the replication process

Below we provide a script that connects the database on the **secondary** node to
the database on the **primary** node, replicates the database, and creates the
needed files for streaming replication.

The directories used are the defaults that are set up in Omnibus. If you have
changed any defaults, adjust the commands below, replacing the directories and paths as appropriate.

WARNING:
Make sure to run this on the **secondary** server as it removes all PostgreSQL data
before running `pg_basebackup`.

1. SSH into your GitLab **secondary** server and log in as root:

   ```shell
   sudo -i
   ```

1. Choose a database-friendly name for your **secondary** node to
   use as the replication slot name. For example, if your domain is
   `secondary.geo.example.com`, you may use `secondary_example` as the slot
   name as shown in the commands below.
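
   Replication slot names may contain only lowercase letters, numbers, and
   underscores, and PostgreSQL identifiers are limited to 63 characters. As a
   small sketch, you can validate a candidate name before using it (the helper
   function below is hypothetical, not a GitLab command):

   ```shell
   # Return success only for valid slot names:
   # lowercase letters, digits, underscores, at most 63 characters.
   is_valid_slot_name() {
     printf '%s' "$1" | grep -Eq '^[a-z0-9_]{1,63}$'
   }

   is_valid_slot_name 'secondary_example' && echo 'valid'  # => valid
   ```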

1. Execute the command below to start a backup/restore and begin the replication:

   WARNING:
   Each Geo **secondary** node must have its own unique replication slot name.
   Using the same slot name between two secondaries breaks PostgreSQL replication.

   ```shell
   gitlab-ctl replicate-geo-database \
      --slot-name=<secondary_node_name> \
      --host=<primary_node_ip>
   ```

   NOTE:
   Replication slot names must only contain lowercase letters, numbers, and the underscore character.

   When prompted, enter the _plaintext_ password you set up for the `gitlab_replicator`
   user in the first step.

   This command also takes a number of additional options. You can use `--help`
   to list them all, but here are a couple of tips:

   - If PostgreSQL is listening on a non-standard port, add `--port=` as well.
   - If your database is too large to be transferred in 30 minutes, you need
     to increase the timeout, for example, `--backup-timeout=3600` if you expect the
     initial replication to take under an hour.
   - Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether
     (for example, you know the network path is secure, or you are using a site-to-site
     VPN). This is **not** safe over the public Internet!
   - You can read more details about each `sslmode` in the
     [PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
     the instructions above are carefully written to ensure protection against
     both passive eavesdroppers and active "man-in-the-middle" attackers.
   - Change the `--slot-name` to the name of the replication slot
     to be used on the **primary** database. The script attempts to create the
     replication slot automatically if it does not exist.
   - If you're repurposing an old server into a Geo **secondary** node, you need to
     add `--force` to the command line.
   - When not on a production machine, you can disable the backup step (if you
     are certain this is what you want) by adding `--skip-backup`.

The replication process is now complete.

### PgBouncer support (optional)

[PgBouncer](https://www.pgbouncer.org/) may be used with GitLab Geo to pool
PostgreSQL connections, which can improve performance even in a
single instance installation.

We recommend using PgBouncer if you use GitLab in a highly available
configuration with a cluster of nodes supporting a Geo **primary** site and
two other clusters of nodes supporting a Geo **secondary** site: one for the
main database and the other for the tracking database. For more information,
see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md).

### Changing the replication password

To change the password for the [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication)
when using Omnibus-managed PostgreSQL instances:

On the GitLab Geo **primary** server:

1. The default value for the replication user is `gitlab_replicator`, but if you've set a custom replication
   user in your `/etc/gitlab/gitlab.rb` under the `postgresql['sql_replication_user']` setting, make sure to
   adapt the following instructions for your own user.

   Generate an MD5 hash of the desired password:

   ```shell
   sudo gitlab-ctl pg-password-md5 gitlab_replicator
   # Enter password: <your_password_here>
   # Confirm password: <your_password_here>
   # 950233c0dfc2f39c64cf30457c3b7f1e
   ```

   Edit `/etc/gitlab/gitlab.rb`:

   ```ruby
   # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab_replicator`
   postgresql['sql_replication_password'] = '<md5_hash_of_your_password>'
   ```

1. Save the file and reconfigure GitLab to change the replication user's password in PostgreSQL:

   ```shell
   sudo gitlab-ctl reconfigure
   ```

1. Restart PostgreSQL for the replication password change to take effect:

   ```shell
   sudo gitlab-ctl restart postgresql
   ```

Until the password is updated on any **secondary** servers, the [PostgreSQL log](../../logs.md#postgresql-logs) on
the secondaries reports the following error message:

```console
FATAL:  could not connect to the primary server: FATAL:  password authentication failed for user "gitlab_replicator"
```

On all GitLab Geo **secondary** servers:

1. The first step isn't necessary from a configuration perspective, since the hashed `'sql_replication_password'`
   is not used on the GitLab Geo **secondary**. However, in the event that the **secondary** needs to be promoted
   to the GitLab Geo **primary**, make sure to match the `'sql_replication_password'` in the secondary
   server configuration.

   Edit `/etc/gitlab/gitlab.rb`:

   ```ruby
   # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab_replicator` on the Geo primary
   postgresql['sql_replication_password'] = '<md5_hash_of_your_password>'
   ```

1. During the initial replication setup, the `gitlab-ctl replicate-geo-database` command writes the plaintext
   password for the replication user account to two locations:

   - `gitlab-geo.conf`: Used by the PostgreSQL replication process, written to the PostgreSQL data
      directory, by default at `/var/opt/gitlab/postgresql/data/gitlab-geo.conf`.
   - `.pgpass`: Used by the `gitlab-psql` user, located by default at `/var/opt/gitlab/postgresql/.pgpass`.

   Update the plaintext password in both of these files, and restart PostgreSQL:

   ```shell
   sudo gitlab-ctl restart postgresql
   ```

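
   The `.pgpass` file uses the standard `host:port:database:user:password`
   format, one entry per line. As a hypothetical sketch (not an official GitLab
   command), the replication user's password field could be rewritten with `awk`,
   assuming the password itself contains no colons:

   ```shell
   # Print a .pgpass-style file with the password field replaced for
   # the gitlab_replicator user. Usage:
   # update_pgpass_password /var/opt/gitlab/postgresql/.pgpass '<new_password>'
   update_pgpass_password() {
     awk -F: -v new="$2" 'BEGIN { OFS = ":" } $4 == "gitlab_replicator" { $5 = new } { print }' "$1"
   }
   ```

   Review the output before writing it back to the file; `gitlab-geo.conf` uses
   a different format, so update it separately.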
## Multi-node database replication

In GitLab 14.0, Patroni replaced `repmgr` as the supported
[highly available PostgreSQL solution](../../postgresql/replication_and_failover.md).

NOTE:
If you have not yet [migrated from repmgr to Patroni](#migrating-from-repmgr-to-patroni), you're highly advised to do so.

### Patroni support

Patroni is the official replication management solution for Geo. It
can be used to build a highly available cluster on the **primary** and a **secondary** Geo site.
Using Patroni on a **secondary** site is optional and you don't have to use the same number of
nodes on each Geo site.

For instructions about how to set up Patroni on the primary site, see the
[PostgreSQL replication and failover with Omnibus GitLab](../../postgresql/replication_and_failover.md#patroni) page.

#### Configuring Patroni cluster for a Geo secondary site

In a Geo secondary site, the main PostgreSQL database is a read-only replica of the primary site's PostgreSQL database.

If you are currently using `repmgr` on your Geo primary site, see [these instructions](#migrating-from-repmgr-to-patroni)
for migrating from `repmgr` to Patroni.

A production-ready and secure setup requires at least:

- 3 Consul nodes _(primary and secondary sites)_
- 2 Patroni nodes _(primary and secondary sites)_
- 1 PgBouncer node _(primary and secondary sites)_
- 1 internal load-balancer _(primary site only)_

The internal load balancer provides a single endpoint for connecting to the Patroni cluster's leader whenever a new leader is
elected, and it is required for enabling cascading replication from the secondary sites.

Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni)
and other database best practices.

##### Step 1. Configure Patroni permanent replication slot on the primary site

To set up database replication with Patroni on a secondary node, we need to
configure a _permanent replication slot_ on the primary node's Patroni cluster,
and ensure password authentication is used.

For each Patroni instance on the primary site **starting on the Patroni
Leader instance**:

1. SSH into your Patroni instance and log in as root:

   ```shell
   sudo -i
   ```

1. Edit `/etc/gitlab/gitlab.rb` and add the following:

   ```ruby
   roles(['patroni_role'])

   consul['services'] = %w(postgresql)
   consul['configuration'] = {
     retry_join: %w[CONSUL_PRIMARY1_IP CONSUL_PRIMARY2_IP CONSUL_PRIMARY3_IP]
   }

   # You need one entry for each secondary, with a unique name following PostgreSQL slot_name constraints:
   #
   # Configuration syntax is: 'unique_slotname' => { 'type' => 'physical' },
   # We don't support setting a permanent replication slot for logical replication type
   patroni['replication_slots'] = {
     'geo_secondary' => { 'type' => 'physical' }
   }

   patroni['use_pg_rewind'] = true
   patroni['postgresql']['max_wal_senders'] = 8 # Use double of the amount of patroni/reserved slots (3 patronis + 1 reserved slot for a Geo secondary).
   patroni['postgresql']['max_replication_slots'] = 8 # Use double of the amount of patroni/reserved slots (3 patronis + 1 reserved slot for a Geo secondary).
   patroni['username'] = 'PATRONI_API_USERNAME'
   patroni['password'] = 'PATRONI_API_PASSWORD'
   patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD'

   # Add all patroni nodes to the allowlist
   patroni['allowlist'] = %w[
     127.0.0.1/32
     PATRONI_PRIMARY1_IP/32 PATRONI_PRIMARY2_IP/32 PATRONI_PRIMARY3_IP/32
     PATRONI_SECONDARY1_IP/32 PATRONI_SECONDARY2_IP/32 PATRONI_SECONDARY3_IP/32
   ]

   # We list all secondary instances as they can all become a Standby Leader
   postgresql['md5_auth_cidr_addresses'] = %w[
     PATRONI_PRIMARY1_IP/32 PATRONI_PRIMARY2_IP/32 PATRONI_PRIMARY3_IP/32 PATRONI_PRIMARY_PGBOUNCER/32
     PATRONI_SECONDARY1_IP/32 PATRONI_SECONDARY2_IP/32 PATRONI_SECONDARY3_IP/32 PATRONI_SECONDARY_PGBOUNCER/32
   ]

   postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH'
   postgresql['sql_replication_password'] = 'POSTGRESQL_REPLICATION_PASSWORD_HASH'
   postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH'
   postgresql['listen_address'] = '0.0.0.0' # You can use a public or VPC address here instead
   ```

1. Reconfigure GitLab for the changes to take effect:

   ```shell
   gitlab-ctl reconfigure
   ```
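
   The two `8` values above follow the sizing rule in the inline comments:
   double the number of Patroni nodes plus reserved Geo replication slots. For
   example, with three Patroni nodes and one reserved slot for a Geo secondary:

   ```shell
   # Sizing rule: (Patroni nodes + reserved Geo slots) * 2
   patroni_nodes=3
   geo_slots=1
   echo $(( (patroni_nodes + geo_slots) * 2 ))  # => 8
   ```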

##### Step 2. Configure the internal load balancer on the primary site

To avoid reconfiguring the Standby Leader on the secondary site whenever a new
Leader is elected on the primary site, we need to set up a TCP internal load
balancer which gives a single endpoint for connecting to the Patroni
cluster's Leader.

The Omnibus GitLab packages do not include a Load Balancer. Here's how you
could do it with [HAProxy](https://www.haproxy.org/).

The following IPs and names are used as an example:

- `10.6.0.21`: Patroni 1 (`patroni1.internal`)
- `10.6.0.22`: Patroni 2 (`patroni2.internal`)
- `10.6.0.23`: Patroni 3 (`patroni3.internal`)

```plaintext
global
    log /dev/log local0
    log localhost local1 notice
    log stdout format raw local0

defaults
    log global
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions

frontend internal-postgresql-tcp-in
    bind *:5000
    mode tcp
    option tcplog

    default_backend postgresql

backend postgresql
    option httpchk
    http-check expect status 200

    server patroni1.internal 10.6.0.21:5432 maxconn 100 check port 8008
    server patroni2.internal 10.6.0.22:5432 maxconn 100 check port 8008
    server patroni3.internal 10.6.0.23:5432 maxconn 100 check port 8008
```

Refer to your preferred Load Balancer's documentation for further guidance.

##### Step 3. Configure a PgBouncer node on the secondary site

A production-ready and highly available configuration requires at least
three Consul nodes and a minimum of one PgBouncer node, though it's recommended
to have one PgBouncer node per database node. An internal TCP load balancer is
required when there is more than one PgBouncer service node. The internal load
balancer provides a single endpoint for connecting to the PgBouncer cluster. For
more information, see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md).

Follow the minimal configuration for the PgBouncer node:

1. SSH into your PgBouncer node and log in as root:

   ```shell
   sudo -i
   ```

1. Edit `/etc/gitlab/gitlab.rb` and add the following:

   ```ruby
   # Disable all components except Pgbouncer and Consul agent
   roles(['pgbouncer_role'])

   # PgBouncer configuration
   pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
   pgbouncer['users'] = {
     'gitlab-consul': {
       # Generate it with: `gitlab-ctl pg-password-md5 gitlab-consul`
       password: 'GITLAB_CONSUL_PASSWORD_HASH'
     },
     'pgbouncer': {
       # Generate it with: `gitlab-ctl pg-password-md5 pgbouncer`
       password: 'PGBOUNCER_PASSWORD_HASH'
     }
   }

   # Consul configuration
   consul['watchers'] = %w(postgresql)
   consul['configuration'] = {
     retry_join: %w[CONSUL_SECONDARY1_IP CONSUL_SECONDARY2_IP CONSUL_SECONDARY3_IP]
   }
   consul['monitoring_service_discovery'] = true
   ```

1. Reconfigure GitLab for the changes to take effect:

   ```shell
   gitlab-ctl reconfigure
   ```

1. Create a `.pgpass` file so Consul is able to reload PgBouncer. Enter the `PLAIN_TEXT_PGBOUNCER_PASSWORD` twice when asked:

   ```shell
   gitlab-ctl write-pgpass --host 127.0.0.1 --database pgbouncer --user pgbouncer --hostuser gitlab-consul
   ```

1. Reload the PgBouncer service:

   ```shell
   gitlab-ctl hup pgbouncer
   ```

784##### Step 4. Configure a Standby cluster on the secondary site
785
786NOTE:
787If you are converting a secondary site to a Patroni Cluster, you must start
788on the PostgreSQL instance. It becomes the Patroni Standby Leader instance,
789and then you can switchover to another replica if you need.
790
791For each Patroni instance on the secondary site:

1. SSH into your Patroni node and log in as root:

   ```shell
   sudo -i
   ```

1. Edit `/etc/gitlab/gitlab.rb` and add the following:

   ```ruby
   roles(['consul_role', 'patroni_role'])

   consul['enable'] = true
   consul['configuration'] = {
     retry_join: %w[CONSUL_SECONDARY1_IP CONSUL_SECONDARY2_IP CONSUL_SECONDARY3_IP]
   }

   postgresql['md5_auth_cidr_addresses'] = [
     'PATRONI_SECONDARY1_IP/32', 'PATRONI_SECONDARY2_IP/32', 'PATRONI_SECONDARY3_IP/32', 'PATRONI_SECONDARY_PGBOUNCER/32',
     # Any other instance that needs access to the database as per documentation
   ]

   # Add patroni nodes to the allowlist
   patroni['allowlist'] = %w[
     127.0.0.1/32
     PATRONI_SECONDARY1_IP/32 PATRONI_SECONDARY2_IP/32 PATRONI_SECONDARY3_IP/32
   ]

   patroni['standby_cluster']['enable'] = true
   patroni['standby_cluster']['host'] = 'INTERNAL_LOAD_BALANCER_PRIMARY_IP'
   patroni['standby_cluster']['port'] = INTERNAL_LOAD_BALANCER_PRIMARY_PORT
   patroni['standby_cluster']['primary_slot_name'] = 'geo_secondary' # Or the unique replication slot name you set up before
   patroni['username'] = 'PATRONI_API_USERNAME'
   patroni['password'] = 'PATRONI_API_PASSWORD'
   patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD'
   patroni['use_pg_rewind'] = true
   patroni['postgresql']['max_wal_senders'] = 5 # A minimum of three for one replica, plus two for each additional replica
   patroni['postgresql']['max_replication_slots'] = 5 # A minimum of three for one replica, plus two for each additional replica

   postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH'
   postgresql['sql_replication_password'] = 'POSTGRESQL_REPLICATION_PASSWORD_HASH'
   postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH'
   postgresql['listen_address'] = '0.0.0.0' # You can use a public or VPC address here instead

   gitlab_rails['db_password'] = 'POSTGRESQL_PASSWORD'
   gitlab_rails['enable'] = true
   gitlab_rails['auto_migrate'] = false
   ```

1. Reconfigure GitLab for the changes to take effect.
   This is required to bootstrap PostgreSQL users and settings.

   - If this is a fresh installation of Patroni:

     ```shell
     gitlab-ctl reconfigure
     ```

   - If you are configuring a Patroni standby cluster on a site that previously had a working Patroni cluster:

     ```shell
     gitlab-ctl stop patroni
     rm -rf /var/opt/gitlab/postgresql/data
     /opt/gitlab/embedded/bin/patronictl -c /var/opt/gitlab/patroni/patroni.yaml remove postgresql-ha
     gitlab-ctl reconfigure
     gitlab-ctl start patroni
     ```

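The `max_wal_senders` and `max_replication_slots` comments above encode a sizing
rule. As a worked example under that rule (three for the first replica, plus two
for each additional replica), two replicas need a minimum of five, matching the
value in the configuration above:

```shell
# Minimum senders/slots for n replicas: 3 + 2 * (n - 1)
replicas=2
echo $((3 + 2 * (replicas - 1)))   # prints 5
```
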
### Migrating from repmgr to Patroni

1. Before migrating, it is recommended that there is no replication lag between the primary and secondary sites and that replication is paused. In GitLab 13.2 and later, you can pause and resume replication with `gitlab-ctl geo-replication-pause` and `gitlab-ctl geo-replication-resume` on a Geo secondary database node.
1. Follow the [instructions to migrate repmgr to Patroni](../../postgresql/replication_and_failover.md#switching-from-repmgr-to-patroni). When configuring Patroni on each primary site database node, add `patroni['replication_slots'] = { '<slot_name>' => 'physical' }`
   to `gitlab.rb`, where `<slot_name>` is the name of the replication slot for your Geo secondary. This ensures that Patroni recognizes the replication slot as permanent and does not drop it upon restarting.
1. If database replication to the secondary was paused before migration, resume replication once Patroni is confirmed working on the primary.

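For example, if your Geo secondary uses the `geo_secondary` replication slot name
shown in the standby cluster configuration earlier on this page, the `gitlab.rb`
addition on each primary site database node would be (adjust the slot name if
yours differs):

```ruby
# Mark the Geo secondary's replication slot as permanent so Patroni keeps it across restarts
patroni['replication_slots'] = { 'geo_secondary' => 'physical' }
```
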
### Migrating a single PostgreSQL node to Patroni

Before the introduction of Patroni, Geo had no Omnibus support for HA setups on the secondary node.

With Patroni, it's now possible. To migrate the existing PostgreSQL instance to Patroni:

1. Make sure you have a Consul cluster set up on the secondary (similar to how you set it up on the primary).
1. [Configure a permanent replication slot](#step-1-configure-patroni-permanent-replication-slot-on-the-primary-site).
1. [Configure the internal load balancer](#step-2-configure-the-internal-load-balancer-on-the-primary-site).
1. [Configure a PgBouncer node](#step-3-configure-a-pgbouncer-node-on-the-secondary-site).
1. [Configure a Standby Cluster](#step-4-configure-a-standby-cluster-on-the-secondary-site)
   on that single node machine.

You end up with a "Standby Cluster" with a single node. This allows you to add
additional Patroni nodes later by following the same instructions.

### Configuring Patroni cluster for the tracking PostgreSQL database

Secondary sites use a separate PostgreSQL installation as a tracking database to
keep track of replication status and automatically recover from potential replication issues.
Omnibus automatically configures a tracking database when `roles(['geo_secondary_role'])` is set.

If you want to run this database in a highly available configuration, don't use the `geo_secondary_role` above.
Instead, follow the instructions below.

A production-ready and secure setup requires at least three Consul nodes, two
Patroni nodes, and one PgBouncer node on the secondary site.

Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni)
and other database best practices.

#### Step 1. Configure a PgBouncer node on the secondary site

Follow the minimal configuration for the PgBouncer node for the tracking database:

1. SSH into your PgBouncer node and log in as root:

   ```shell
   sudo -i
   ```

1. Edit `/etc/gitlab/gitlab.rb` and add the following:

   ```ruby
   # Disable all components except PgBouncer and the Consul agent
   roles(['pgbouncer_role'])

   # PgBouncer configuration
   pgbouncer['users'] = {
     'pgbouncer': {
       password: 'PGBOUNCER_PASSWORD_HASH'
     }
   }

   pgbouncer['databases'] = {
     gitlabhq_geo_production: {
       user: 'pgbouncer',
       password: 'PGBOUNCER_PASSWORD_HASH'
     }
   }

   # Consul configuration
   consul['watchers'] = %w(postgresql)

   consul['configuration'] = {
     retry_join: %w[CONSUL_TRACKINGDB1_IP CONSUL_TRACKINGDB2_IP CONSUL_TRACKINGDB3_IP]
   }

   consul['monitoring_service_discovery'] = true

   # GitLab database settings
   gitlab_rails['db_database'] = 'gitlabhq_geo_production'
   gitlab_rails['db_username'] = 'gitlab_geo'
   ```

1. Reconfigure GitLab for the changes to take effect:

   ```shell
   gitlab-ctl reconfigure
   ```

1. Create a `.pgpass` file so Consul is able to reload PgBouncer. Enter the `PLAIN_TEXT_PGBOUNCER_PASSWORD` twice when asked:

   ```shell
   gitlab-ctl write-pgpass --host 127.0.0.1 --database pgbouncer --user pgbouncer --hostuser gitlab-consul
   ```

1. Restart the PgBouncer service:

   ```shell
   gitlab-ctl restart pgbouncer
   ```

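The `write-pgpass` command above stores the PgBouncer password in a standard
PostgreSQL `.pgpass` file for the `gitlab-consul` system user. As an assumed
sketch of its contents, in the standard `hostname:port:database:username:password`
format (the exact file Omnibus writes may differ):

```plaintext
127.0.0.1:*:pgbouncer:pgbouncer:PLAIN_TEXT_PGBOUNCER_PASSWORD
```
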
#### Step 2. Configure a Patroni cluster

For each Patroni instance on the secondary site for the tracking database:

1. SSH into your Patroni node and log in as root:

   ```shell
   sudo -i
   ```

1. Edit `/etc/gitlab/gitlab.rb` and add the following:

   ```ruby
   # Disable all components except PostgreSQL, Patroni, and Consul
   roles(['patroni_role'])

   # Consul configuration
   consul['services'] = %w(postgresql)

   consul['configuration'] = {
     server: true,
     retry_join: %w[CONSUL_TRACKINGDB1_IP CONSUL_TRACKINGDB2_IP CONSUL_TRACKINGDB3_IP]
   }

   # PostgreSQL configuration
   postgresql['listen_address'] = '0.0.0.0'
   postgresql['hot_standby'] = 'on'
   postgresql['wal_level'] = 'replica'

   postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH'
   postgresql['sql_replication_password'] = 'POSTGRESQL_REPLICATION_PASSWORD_HASH'
   postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH'

   postgresql['md5_auth_cidr_addresses'] = [
     'PATRONI_TRACKINGDB1_IP/32', 'PATRONI_TRACKINGDB2_IP/32', 'PATRONI_TRACKINGDB3_IP/32', 'PATRONI_TRACKINGDB_PGBOUNCER/32',
     # Any other instance that needs access to the database as per documentation
   ]

   # Add patroni nodes to the allowlist
   patroni['allowlist'] = %w[
     127.0.0.1/32
     PATRONI_TRACKINGDB1_IP/32 PATRONI_TRACKINGDB2_IP/32 PATRONI_TRACKINGDB3_IP/32
   ]

   # Patroni configuration
   patroni['username'] = 'PATRONI_API_USERNAME'
   patroni['password'] = 'PATRONI_API_PASSWORD'
   patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD'
   patroni['postgresql']['max_wal_senders'] = 5 # A minimum of three for one replica, plus two for each additional replica

   # GitLab database settings
   gitlab_rails['db_database'] = 'gitlabhq_geo_production'
   gitlab_rails['db_username'] = 'gitlab_geo'
   gitlab_rails['enable'] = true

   # Disable automatic database migrations
   gitlab_rails['auto_migrate'] = false
   ```

1. Reconfigure GitLab for the changes to take effect.
   This is required to bootstrap PostgreSQL users and settings:

   ```shell
   gitlab-ctl reconfigure
   ```

#### Step 3. Configure the tracking database on the secondary nodes

For each node running the `gitlab-rails`, `sidekiq`, and `geo-logcursor` services:

1. SSH into your node and log in as root:

   ```shell
   sudo -i
   ```

1. Edit `/etc/gitlab/gitlab.rb` and add the following attributes. You may have other attributes set, but the following must be set:

   ```ruby
   # Tracking database settings
   geo_secondary['db_username'] = 'gitlab_geo'
   geo_secondary['db_password'] = 'PLAIN_TEXT_PGBOUNCER_PASSWORD'
   geo_secondary['db_database'] = 'gitlabhq_geo_production'
   geo_secondary['db_host'] = 'PATRONI_TRACKINGDB_PGBOUNCER_IP'
   geo_secondary['db_port'] = 6432
   geo_secondary['auto_migrate'] = false

   # Disable the tracking database service
   geo_postgresql['enable'] = false
   ```

1. Reconfigure GitLab for the changes to take effect:

   ```shell
   gitlab-ctl reconfigure
   ```

1. Run the tracking database migrations:

   ```shell
   gitlab-rake geo:db:migrate
   ```

### Migrating a single tracking database node to Patroni

Before the introduction of Patroni, Geo had no Omnibus support for HA setups on
the secondary node.

With Patroni, it's now possible. Because the Patroni implementation in Omnibus
does not support managing two different clusters on the same machine, we
recommend setting up a new Patroni cluster for the tracking database by
following the same instructions above.

The secondary nodes backfill the new tracking database, and no data
synchronization is required.

## Troubleshooting

Read the [troubleshooting document](../replication/troubleshooting.md).
