Directory contents:

  centos/  debian/  freebsd/
  LICENSE  LICENSE.mysql  QUICK_START  README
  build.sh  freebsd.sh  get_patch.sh  rpm.sh  rpm_wc.sh
  my-5.1.cnf  my-5.5.cnf
  mysql-galera  mysql-plain

README

Codership Oy
http://www.codership.com
<info@codership.com>

DISCLAIMER

THIS SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
IN NO EVENT SHALL CODERSHIP OY BE HELD LIABLE TO ANY PARTY FOR ANY DAMAGES
RESULTING DIRECTLY OR INDIRECTLY FROM THE USE OF THIS SOFTWARE.

Trademark Information.

MySQL is a trademark or registered trademark of Sun Microsystems, Inc. or
its subsidiaries in the US and other countries. Other marks are the property
of their respective owners.

Licensing Information.

Please see ./mysql/LICENSE.mysql and ./galera/LICENSE.galera

Project source code can be found at:
wsrep API:    https://github.com/codership/wsrep-API/
MySQL patch:  https://launchpad.net/codership-mysql/
Galera libs:  https://launchpad.net/galera/

ABOUT THIS DOCUMENT

This document covers issues specific to this MySQL/Galera demo distribution.
It does not cover the use or administration of the MySQL server per se. The
reader is assumed to know how to install, configure, administer and use a
MySQL server.

                     MYSQL/GALERA v23.x demo package

CONTENTS:
=========
0. WHAT IS MYSQL/GALERA CLUSTER
1. CLUSTER SETUP
   1.1 INSTALLATION
   1.2 CLUSTER URL
   1.3 STARTING THE FIRST NODE OF A CLUSTER
   1.4 STARTING REMAINING NODES
2. USING THE CLUSTER
   2.1 LOADING DATA TO A CLUSTER
   2.2 CONNECTING APPLICATION TO A CLUSTER
   2.3 LOAD BALANCER
   2.4 ADDING NEW NODE TO A CLUSTER
   2.5 A "REFERENCE NODE"
   2.6 "SPLIT-BRAIN" CONDITION
3. CONFIGURATION
   3.1 MANDATORY MYSQL OPTIONS
   3.2 OPTIONAL OPTIMIZATIONS
   3.3 WSREP OPTIONS
   3.4 CONFIGURING LOCAL MYSQL CLIENTS
   3.5 FIREWALLS
4. USING MYSQL/GALERA IN AMAZON EC2
5. LIMITATIONS

0. WHAT IS MYSQL/GALERA CLUSTER

MySQL/Galera cluster is a synchronous multi-master MySQL server cluster.
Cluster nodes are connected to each other in an N-to-N fashion through a group
communication backend which provides automatic reconfiguration in the event of
a node failure or a new node joining the cluster:

     ,--------.    ,--------.    ,--------.
     | mysqld |----| mysqld |<---| client |
     `--------'    `--------'    `--------'
              \    /
            ,--------.    ,--------.
            | mysqld |<---| client |
            `--------'    `--------'

With a few exceptions, each cluster node can be used like a standalone MySQL
server and supports most MySQL features, including transactions, triggers and
stored procedures.

Node states are synchronized by replicating transaction changes at commit time.
The cluster is virtually synchronous: each node commits transactions in exactly
the same order, although not necessarily at the same physical moment. (The
latter is not as important as it may seem, since in most cases a DBMS gives no
guarantee on when a transaction is actually processed.) Built-in flow control
keeps nodes within a fraction of a second of each other, which is more than
enough for most practical purposes.

Main features of the MySQL/Galera cluster:

* True high availability: no committed transaction is ever lost in case of a
  node crash, and all nodes always have a consistent state.

* True multi-master: all cluster nodes can modify the same table concurrently.

* High transparency: the cluster is intended as a drop-in replacement for a
  single MySQL server. (See LIMITATIONS below.)

* Scalability, even for WRITE-intensive applications.

This demo distribution contains all the software you'll need to set up a
MySQL/Galera cluster. It is essentially a self-contained MySQL server
installation with its own configuration file, data directory and preconfigured
empty database. You don't need administrative privileges, nor do you need to
uninstall or disable a previously installed MySQL server, to use this
distribution.


1. CLUSTER SETUP

To set up a MySQL/Galera cluster you will need several networked computers,
one for each mysqld instance you plan to use. For best performance these
computers should have approximately the same configuration: Galera replication
is synchronous, and one overloaded machine may slow down the whole cluster.
(This depends on load distribution, however: a node that does not handle
client connections can be considerably slower.)

It takes 3 steps to set up the cluster:

1) Copy this distribution to all prospective nodes of the cluster and unpack it
   to a location of your choice.

2) Start the first node to begin a new cluster.

3) Start the remaining nodes, pointing them to the first one.

(NOTE: You can easily set up the cluster on a single computer. However, this
makes little sense, as you won't see the benefits of high availability and
scalability. Hence it is not covered by this document.)


1.1 INSTALLATION

Just copy and unpack the distribution on the prospective cluster nodes to
wherever you have sufficient privileges. The distribution is designed to run
on most systems without reconfiguration. It is a self-contained MySQL
installation and comes with its own data directory and a preconfigured empty
database with users 'root' (password 'rootpass') and 'test' (password
'testpass', privileges on schema 'test'). The default installation requires at
least 1 GB of free space for InnoDB files (created on the first start).

This requirement, as well as other MySQL and Galera options, can be changed by
editing the configuration file, which can be found at
<INST_ROOT>/mysql/etc/my.cnf.

Please see the CONFIGURATION chapter for details on the editable parameters.


1.2 CLUSTER URL

A cluster URL is a connection handle that a new node uses to connect to the
rest of the cluster. Its form is backend-specific, and the backend is
determined by the URL schema. The default is 'dummy', which means no
replication. This demo comes with a distributed group communication backend
whose schema is 'gcomm'.
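
For example, all of the following are valid cluster URLs in this distribution
(4567 is the default gcomm port, see WSREP OPTIONS below):

   dummy://                   # no replication
   gcomm://                   # initialize a new cluster (first node)
   gcomm://192.168.1.1        # connect to a node at the default port
   gcomm://192.168.1.1:4567   # the same, with the port given explicitly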


1.3 STARTING THE FIRST NODE OF A CLUSTER

<INST_ROOT>/mysql-galera is a special MySQL startup script that sets the
proper options (including the data directory path) for mysqld. If you're
running it as a superuser, you have to make sure that there is a 'mysql' user
in the system and that it has sufficient privileges on the installation
directory (see the MySQL Manual about running mysqld as root).

The first node of a cluster has nowhere to connect to, so it has to be started
with an empty cluster address (note that it still initializes the gcomm
backend):

   <INST_ROOT>/mysql-galera -g gcomm:// start


1.4 STARTING REMAINING NODES

To add another node to the cluster, it must be given the address of one of the
existing cluster nodes.

Thus, if the first cluster node has IP address 192.168.1.1, the second will be
started like this:

   <INST_ROOT>/mysql-galera -g gcomm://192.168.1.1 start

The third node can use either the first or the second node's address, and so
on.

It might take a few minutes to start mysqld for the first time, as it has to
create the required InnoDB files.
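
Once the nodes are up, you can verify from any node that they have all joined
the cluster by checking the wsrep_cluster_size status variable (output shown
here for a three-node cluster):

   mysql> SHOW STATUS LIKE 'wsrep_cluster_size';
   +--------------------+-------+
   | Variable_name      | Value |
   +--------------------+-------+
   | wsrep_cluster_size | 3     |
   +--------------------+-------+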

For a full description of mysql-galera options and commands, see:

   <INST_ROOT>/mysql-galera --help


2. USING THE CLUSTER

After you have successfully started all the cluster nodes, the cluster is
ready to use.

From the client's point of view, each cluster node works like a usual MySQL
server: the client-side application does not have to be changed in any way.
Each node can be accessed independently and asynchronously; just direct the
SQL load to any one or more of the cluster nodes. For most practical purposes
you can treat the MySQL/Galera cluster as a single MySQL server listening on
multiple interfaces, with the exception that you might see transaction
deadlocks where you previously didn't.


2.1 LOADING DATA TO A CLUSTER

Initially the distribution database is empty. You can populate it by loading a
dump of your data into any one of the nodes; it will be automatically
replicated to the others. Please note that this release supports only the
InnoDB storage engine.
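
For example, assuming a dump file produced by mysqldump (the file name and
schema name below are placeholders), the data could be loaded through the node
from section 1.4 like this:

   mysql -uroot -prootpass -h 192.168.1.1 test < mydata.sql

Keep in mind that only InnoDB tables will be replicated; tables of other types
in the dump would first have to be converted, e.g. with
ALTER TABLE ... ENGINE=InnoDB.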


2.2 CONNECTING APPLICATION TO A CLUSTER

As mentioned above, to the client application each node looks like a normal
MySQL server and can be used independently. This creates considerable
flexibility in the way the cluster can be utilized. The approaches can be
categorized into three groups:

1) Seeking High Availability only. This is similar to traditional MySQL
   master-slave replication. In this case the client application connects to
   only one node, the rest serving as hot backups / read-only slaves:

   ,-------------.
   | application |
   `-------------'
        | | |       DB backups/read-only slaves
      ,-------. ,-------. ,-------.
      | node1 | | node2 | | node3 |
      `-------' `-------' `-------'
       <===== cluster nodes =====>

   In the case of primary node failure the application can instantly switch to
   another node without any preparations. This is also the most transparent
   mode: COMMITs will never return deadlocks, and table locks can be used too.

2) Seeking High Availability and improved performance through uniform load
   distribution. If there are several client connections to the database, they
   can be uniformly distributed between the cluster nodes, resulting in better
   performance. The exact degree of performance improvement depends on the
   application's SQL profile. Note that the transaction rollback rate may also
   increase.

             ,-------------.
             |   clients   |
             `-------------'
                 | | | |
             ,-------------.
             | application |
             `-------------'
             /      |      \
      ,-------. ,-------. ,-------.
      | node1 | | node2 | | node3 |
      `-------' `-------' `-------'
       <===== cluster nodes =====>

   In the case of a node failure the application can keep on using the
   remaining healthy nodes.

   In this setup the application can also be clustered, with a dedicated
   application instance per database node, thus achieving HA not only for the
   database but for the whole application stack:

             ,-------------.
             |   clients   |
             `-------------'
             //     ||     \\
      ,------.   ,------.   ,------.
      | app1 |   | app2 |   | app3 |
      `------'   `------'   `------'
         |          |          |
     ,-------.  ,-------.  ,-------.
     | node1 |  | node2 |  | node3 |
     `-------'  `-------'  `-------'
      <====== cluster nodes ======>

3) Seeking High Availability and improved performance through smart load
   distribution. Uniform load distribution can cause an undesirably high
   rollback rate. Directing transactions which access the same set of tables
   to the same node can considerably improve performance by reducing the
   number of rollbacks. Also, if your application can distinguish between
   read/write and read-only transactions, the following configuration may be
   quite efficient:

             ,---------------------.
             |     application     |
             `---------------------'
       writes /         | reads    \ reads
      ,-------.     ,-------.     ,-------.
      | node1 |     | node2 |     | node3 |
      `-------'     `-------'     `-------'
       <========= cluster nodes =========>


2.3 LOAD BALANCER

If your application cannot utilize several database servers (most can't), you
will need to use an SQL proxy or a TCP load balancer to distribute the load
between the MySQL/Galera nodes. This is needed not only to increase
performance, but also to allow a quick switchover in case of a node failure.
If the performance of your application is DBMS-bound, you can run the balancer
on the same machine as the application/client. Be aware, however, that SQL
load balancing can be a CPU-hungry operation: SQL traffic usually consists of
many small packets. For best results we recommend carefully examining the CPU
consumption of the balancer and, if needed, dedicating a separate machine
to it.

Unlike a traditional MySQL master-slave cluster, a MySQL/Galera cluster does
not require any SQL traffic filtering: it is highly transparent, and a plain
TCP connection balancer will suffice.

TCP connection balancers that have been successfully used with MySQL/Galera:
- Pen (http://siag.nu/pen/)
- GLB (http://www.codership.com/en/downloads/glb)
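
As an illustration, a plain TCP balancer like Pen can be pointed at the
cluster in a single line (a sketch only: the listen port and node addresses
are examples, and Pen's exact options should be checked against its own
documentation):

   pen 3306 192.168.1.1:3306 192.168.1.2:3306 192.168.1.3:3306

Clients then connect to the balancer's host and port 3306 as if it were a
single MySQL server.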


2.4 ADDING NEW NODE TO A CLUSTER

With the 0.7 series of releases Codership removed the main obstacle to using
MySQL/Galera in production: the inability to add or replace nodes in a working
cluster. This distribution features automatic state snapshot transfer to a
newly joined node. Until the node receives the state snapshot it won't execute
any queries. A detailed state snapshot transfer sequence diagram can be found
at http://www.codership.com/files/presentations/Galera_OSC_2009.pdf

The process of joining a new node to the cluster consists of two phases:

1) State snapshot transfer (SST) from an established cluster member. Depending
   on the SST method, neither the joining node nor the SST donor can apply
   any transactions for the duration of this phase. Transactions replicated by
   the other nodes are buffered in a queue.

2) A catch-up phase, when both the donor and the joiner try to catch up with
   the cluster by applying the transactions buffered in the queue. Using them
   as working nodes should be avoided during this phase. Its duration depends
   on the load profile and on the duration of the first phase.

NOTE: Transaction buffering currently happens in memory, so prepare
      enough swap space.

By default the cluster chooses the most suitable node to receive the state
transfer from. There is also an option, wsrep_sst_donor, to specify the
desired state snapshot source in my.cnf or on the command line. See the
CONFIGURATION section for descriptions of all relevant configuration options.
In most situations (like a cluster on EC2) this distribution should work with
the default settings.
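
For instance, to make a node request its snapshot from a particular donor, one
could add the following to my.cnf ('node1' here is a hypothetical
wsrep_node_name of an existing cluster member):

   [mysqld]
   wsrep_sst_donor=node1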

The default state transfer method is based on mysqldump. Although it is
relatively slow, it provides complete cloning of the donor state, including
the system tables, and is thus the most compliant method (see wsrep_sst_method
in the CONFIGURATION section for the other supported methods).


2.5 A "REFERENCE NODE"

For practical purposes we recommend reserving a "reference node" in the
cluster. A "reference node" is a node that does not receive any SQL load.
Having such a node in a cluster serves several purposes:

1) Data consistency: since this node does not process any SQL load on its own,
   it has the lowest probability of transaction conflicts and therefore of
   nondeterministic conflict resolution. In the event of database
   inconsistencies being discovered in the cluster, this node will have the
   most relevant database.

2) Data safety: since this node does not process any SQL load on its own, it
   has the lowest probability of failing with catastrophic consequences. In
   the event of a total cluster failure (e.g. a blackout) this will be the
   best node to restore the cluster from.

3) High availability: a reference node can serve as a dedicated state snapshot
   donor. Since it does not serve any clients, they won't experience service
   interruptions, and the load balancer won't need reconfiguration during SST.

Even with the current TCP-based group communication, the overhead of having
one extra silent node is negligible for most loads.


2.6 "SPLIT-BRAIN" CONDITION

A Galera cluster is fully distributed and does not use any sort of centralized
arbitrator, thus having no single point of failure. However, like any cluster
of this kind, it may fall into the dreaded "split-brain" condition, where half
or more of the cluster's nodes suddenly disappear (e.g. due to a network
failure). In the general case, having no information about the fate of the
disappeared nodes, the remaining nodes cannot continue to process requests and
modify their states.

While such a situation is generally considered highly improbable in a
multi-node cluster (normally nodes fail one by one), in a 2-node cluster a
single node failure can lead to it. This makes 3 nodes the minimum requirement
for a highly available cluster.

Dedicated Galera packages (not this distribution) contain a lightweight
"arbitrator" daemon which can serve as an odd-node substitute in situations
where the cluster size is limited to 2 real nodes.


3. CONFIGURATION

Each MySQL/Galera node is configured just like a usual MySQL server; we have
merely added some configuration variables to my.cnf. In addition, some options
can be passed to the mysql-galera startup script (see mysql-galera --help).


3.1 MANDATORY MYSQL OPTIONS

binlog_format=ROW
   This option is required in order to use row-level replication as opposed to
   statement-level replication. For performance and consistency reasons, do
   not change it. As a side effect, the binlog, if turned on, can be ROW only.
   In the future this option won't have a special meaning.

innodb_autoinc_lock_mode=2
   This is a required parameter. Without it, INSERTs into tables with an
   AUTO_INCREMENT column may fail. Autoinc lock modes 0 and 1 can cause
   unresolved deadlocks and make the system unresponsive.

innodb_locks_unsafe_for_binlog=1
   This setting is required for reliable parallel applying operation.

Mandatory options are hardcoded both in the distribution's my.cnf file and in
the mysql-galera script.
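
Taken together, the mandatory part of my.cnf looks like this:

   [mysqld]
   binlog_format=ROW
   innodb_autoinc_lock_mode=2
   innodb_locks_unsafe_for_binlog=1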


3.2 OPTIONAL OPTIMIZATIONS

While not required for the correct operation of a MySQL/Galera cluster, the
following option may be safely set thanks to the guarantees of synchronous
replication:

innodb_flush_log_at_trx_commit=0

3.3 WSREP OPTIONS

Here WSREP stands for Write-Set REPlication, a synchronous replication API
that Codership is developing for transactional databases. Galera is a library
that implements WSREP services. Default values are shown below.

All options are optional except for wsrep_provider and wsrep_cluster_address.

wsrep_provider=none
   A full path to the library that implements the WSREP interface. If none is
   specified, the server behaves almost like a normal mysqld, with only a
   slight overhead. The mysql-galera script automatically substitutes it to
   point to the Galera implementation shipped with the distribution. It can be
   overridden with the WSREP environment variable.
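
   For example, to start a node against an explicitly chosen provider library
   (the path below is purely illustrative; use the library shipped with this
   distribution):

      WSREP=/path/to/libgalera_smm.so <INST_ROOT>/mysql-galera -g gcomm:// start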

wsrep_provider_options=
   A provider-specific option string. See http://www.codership.com/wiki for
   details.

wsrep_cluster_address="dummy://"
   Group Communication System address. Depends on the WSREP provider. This
   distribution recognizes "dummy://" and "gcomm://<address>[:port]".
   The default port is 4567.

   mysql> set global wsrep_cluster_address=<ADDRESS>;

   will (re)establish the connection to ADDRESS. This can be used to change
   the cluster connection at runtime.

wsrep_cluster_name="my_wsrep_cluster"
   Logical cluster name; must be the same for all nodes of the cluster.

wsrep_node_name=
   Human-readable node name. Defaults to the hostname.

wsrep_slave_threads=1
   Number of threads dedicated to processing write-sets from other nodes.
   For better performance we recommend a few per CPU core.

wsrep_dbug_option
   Options for the built-in DBUG library (independent of what MySQL uses).
   Empty by default. Not used in 0.8.

wsrep_debug=0
   Enable debug-level logging.

wsrep_convert_LOCK_to_trx=0
   Implicitly convert locking sessions into transactions inside mysqld. By
   itself this does not mean support for locking sessions, but it prevents the
   database from going into a logically inconsistent state. Disabled by
   default because of possible memory issues with DB dumps that contain LOCK
   statements.

wsrep_retry_autocommit=1
   Retry autocommit queries and single-statement transactions should they fail
   the certification test. This is analogous to rescheduling an autocommit
   query should it go into a deadlock with other transactions in the database
   lock manager.

wsrep_auto_increment_control=1
   Automatically adjust the auto_increment_increment and auto_increment_offset
   variables based on the number of nodes in the cluster. This significantly
   reduces the certification conflict rate for INSERTs.
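
   For illustration: in a 3-node cluster this sets auto_increment_increment=3
   and gives each node a distinct auto_increment_offset (1, 2 or 3), so one
   node generates keys 1, 4, 7, ..., another 2, 5, 8, ..., and the third
   3, 6, 9, ...; auto-generated keys from different nodes therefore never
   collide.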

wsrep_drupal_282555_workaround=1
   MySQL seems to have an obscure bug whereby an INSERT into a table with an
   AUTO_INCREMENT column, using a NULL value for that column, can fail with a
   duplicate key error. When this option is on, such INSERTs are retried.
   Required for stable Drupal operation. Documented at:
      http://bugs.mysql.com/bug.php?id=41984
      http://drupal.org/node/282555

wsrep_sst_method=mysqldump
   Which method to use to copy the database state to a newly joined node.
   Currently supported methods:
   - mysqldump: generally slow (except on small datasets), but most tested
   - rsync:     the fastest method, especially on large datasets
   - rsync_wan: same as rsync, but uses delta transfer to minimize network
                traffic

wsrep_sst_receive_address=
   The address at which this node expects to receive the state snapshot.
   Defaults to the mysqld bind address; if that is not specified (0.0.0.0),
   to the first IP of eth0 plus the mysqld bind port.

wsrep_sst_auth=
   Authentication information needed for the state transfer. Depends on the
   state transfer method. For mysqldump-based SST it is
   <mysql_root_user>:<mysql_root_password>
   and it should be the same on all nodes: it is used to authenticate with
   both the state snapshot receiver and the state snapshot donor. In this
   distribution it is preconfigured to "root:rootpass".

wsrep_sst_donor=
   The name of the node which should serve as the state snapshot donor. This
   allows you to control which node will serve the state snapshot request. By
   default the most suitable node is chosen by the GCS.


3.4 CONFIGURING LOCAL MYSQL CLIENTS

This MySQL/Galera distribution runs mysqld in a "sandbox", so mysql clients
won't find the mysqld socket at the default system location. Running a mysql
client without explicitly specifying the socket or port (via the --socket or
--host/--port options) may therefore result in the following:

$ mysql -uroot -prootpass
ERROR 2002 (HY000): Can't connect to local MySQL server through socket
'/var/run/mysqld/mysqld.sock' (2)

Most applications that use libmysqlclient to connect to a MySQL server can be
instructed to look in the correct place by adding the following section to the
system-wide my.cnf file:

[client]
socket = <INST_ROOT>/mysql/var/mysqld.sock
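
Alternatively, point the client at the sandbox socket explicitly, using the
preconfigured credentials:

   mysql --socket=<INST_ROOT>/mysql/var/mysqld.sock -uroot -prootpass
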
3.5 FIREWALLS

If any firewalls are used, they should be configured to allow connections
between the nodes on the following ports:

3306 - for mysqldump state snapshot transfer
4567 - for replication traffic

E.g. to configure iptables to allow connections from a local subnet:

iptables -A INPUT -i eth0 -p tcp -m tcp \
         --source 192.168.0.1/24 --dport 3306 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp -m tcp \
         --source 192.168.0.1/24 --dport 4567 -j ACCEPT

Substitute the real IP address and netmask of your node. Better yet, use
a VPN.


4. USING MYSQL/GALERA IN AMAZON EC2

MySQL/Galera works anywhere TCP/IP works. Therefore using the MySQL/Galera
distribution in the Amazon EC2 environment is no different than on a LAN.
Just launch several instances of your favorite AMI, copy and unpack the
distribution, and start the servers. Don't forget to use external addresses
if your nodes are running in different availability zones (obviously, running
in different availability zones degrades performance somewhat).

NOTE: this distribution may be binary-incompatible with some older Linux
      distributions. Please use CentOS 5.0 or newer.


5. LIMITATIONS

1) Currently replication works only with the InnoDB storage engine. Any writes
   to tables of other types, including system (mysql.*) tables, are not
   replicated. However, DDL statements are replicated at the statement level,
   and changes to mysql.* tables will get replicated that way. So you can
   safely issue CREATE USER..., but issuing INSERT INTO mysql.user... will not
   be replicated.

2) DELETE operations are not supported on tables without a primary key. Rows
   in tables without a primary key may appear in a different order on
   different nodes. As a result, SELECT...LIMIT... may return slightly
   different sets.

3) Unsupported queries:
   * LOAD DATA size is limited to ~1 GB
   * lock functions (GET_LOCK(), RELEASE_LOCK(), ...)

4) Locking sessions (LOCK TABLES...UNLOCK TABLES) are not supported in
   multi-master mode. However, if only one node executes locking sessions,
   they will work.

5) The transaction isolation level should be REPEATABLE READ (the default).
   Galera implements implicit snapshot isolation for cluster-wide
   transactions.

6) Due to cluster-level optimistic concurrency control, a transaction issuing
   a COMMIT may still be aborted at that stage: when two transactions write to
   the same rows and commit on separate cluster nodes, only one of them can
   commit successfully; the other will be aborted. For such cluster-level
   aborts the MySQL/Galera cluster returns a deadlock error code
   (Error: 1213 SQLSTATE: 40001 (ER_LOCK_DEADLOCK)). Applications should be
   prepared to retry the failed transaction, just as with a local InnoDB
   deadlock.

7) The query log cannot be directed to a table. If you enable query logging,
   you must forward the log to a file:
       log_output = FILE
   Use general_log and general_log_file to enable query logging and to set
   the log file name.

8) XA transactions are not supported, due to possible rollback on commit.
617