Setting Up Master-Master Replication On Four Nodes With MySQL 5 On Debian Etch

Version 1.0
Author: Falko Timme
Last edited 11/14/2008

This tutorial explains how you can set up MySQL master-master replication on four MySQL nodes (running on Debian Etch). The difference compared to a two-node master-master replication setup (which is explained here) is that with more than two nodes, replication goes in a circle: with four nodes, it goes from node1 to node2, from node2 to node3, from node3 to node4, and from node4 back to node1.

Since version 5, MySQL comes with built-in support for master-master replication, solving the problem that used to occur with auto-generated keys. In earlier MySQL versions, master-master replication broke down as soon as node A and node B both inserted an auto-incrementing key into the same table. The advantages of master-master replication over traditional master-slave replication are that you don't have to modify your applications to send write accesses only to the master, and that it is easier to provide high availability: if one master fails, you still have the other masters.
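
MySQL handles this with the auto_increment_increment and auto_increment_offset server variables, which make every node generate keys from its own number range. As a rough sketch (the concrete values below are an assumption for a four-node ring), each node's my.cnf could contain something like:

[mysqld]
# Each node hands out auto-increment values from its own range,
# so no two masters can ever generate the same key.
auto_increment_increment = 4   # step size = number of nodes in the ring
auto_increment_offset    = 1   # 1 on server1, 2 on server2, 3 on server3, 4 on server4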

I do not issue any guarantee that this will work for you!

 

1 Preliminary Note

In this tutorial I will show how to replicate the database exampledb on four MySQL nodes:

  • server1.example.com: IP address 192.168.0.100
  • server2.example.com: IP address 192.168.0.101
  • server3.example.com: IP address 192.168.0.102
  • server4.example.com: IP address 192.168.0.103

Each node is a master and a slave at the same time. All four systems are running Debian Etch; however, the configuration should apply to almost all distributions with little or no modifications.

Replication will work in a circle, i.e., the replication goes from server1 to server2, from server2 to server3, from server3 to server4, and from server4 back to server1:

... --> server1 --> server2 --> server3 --> server4 --> server1 --> ...

 

2 Installing MySQL 5.0

If MySQL 5.0 isn't already installed on server1 to server4, install it now:

server1/server2/server3/server4:

apt-get install mysql-server-5.0 mysql-client-5.0

To make sure that replication can work, MySQL must listen on all interfaces, so we comment out the line bind-address = 127.0.0.1 in /etc/mysql/my.cnf:

server1/server2/server3/server4:

vi /etc/mysql/my.cnf

[...]
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
#bind-address           = 127.0.0.1
[...]
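
If you want to double-check that the directive is really disabled, you can grep the configuration file; the line should show up with a leading #:

grep bind-address /etc/mysql/my.cnf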

Restart MySQL afterwards:

server1/server2/server3/server4:

/etc/init.d/mysql restart

Then check with

server1/server2/server3/server4:

netstat -tap | grep mysql

that MySQL is really listening on all interfaces:

server1:~# netstat -tap | grep mysql
tcp        0      0 *:mysql                 *:*                     LISTEN     2671/mysqld
server1:~#

Afterwards, set a MySQL password for the user root@localhost:

server1/server2/server3/server4:

mysqladmin -u root password yourrootsqlpassword

Next we create MySQL passwords for root@server1.example.com, root@server2.example.com, root@server3.example.com, and root@server4.example.com:

server1:

mysqladmin -h server1.example.com -u root password yourrootsqlpassword

server2:

mysqladmin -h server2.example.com -u root password yourrootsqlpassword

server3:

mysqladmin -h server3.example.com -u root password yourrootsqlpassword

server4:

mysqladmin -h server4.example.com -u root password yourrootsqlpassword
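
If you like, you can verify that the host-based root accounts work with a quick test query over TCP (shown here for server1; adjust the hostname on the other nodes):

server1:

mysql -h server1.example.com -u root -p -e "SELECT USER();"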

Now we set up a replication user slaveuser_for_s2 that can be used by server2 to access the MySQL database on server1:

server1:

mysql -u root -p

On the MySQL shell, run the following commands:

GRANT REPLICATION SLAVE ON *.* TO 'slaveuser_for_s2'@'%' IDENTIFIED BY 'slave_user_for_server2_password';
FLUSH PRIVILEGES;
quit;
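
The '%' wildcard lets slaveuser_for_s2 connect from any host. If you prefer to lock the account down to server2 only, you could instead grant the privilege to its IP address from the preliminary note (a variant, not required for this tutorial); on the MySQL shell:

GRANT REPLICATION SLAVE ON *.* TO 'slaveuser_for_s2'@'192.168.0.101' IDENTIFIED BY 'slave_user_for_server2_password';
FLUSH PRIVILEGES;

The same restriction can be applied to the grants on server2, server3, and server4 below.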

Then we set up a replication user slaveuser_for_s3 that can be used by server3 to access the MySQL database on server2...

server2:

mysql -u root -p

GRANT REPLICATION SLAVE ON *.* TO 'slaveuser_for_s3'@'%' IDENTIFIED BY 'slave_user_for_server3_password';
FLUSH PRIVILEGES;
quit;

... and a replication user slaveuser_for_s4 that can be used by server4 to access the MySQL database on server3...

server3:

mysql -u root -p

GRANT REPLICATION SLAVE ON *.* TO 'slaveuser_for_s4'@'%' IDENTIFIED BY 'slave_user_for_server4_password';
FLUSH PRIVILEGES;
quit;

... and finally a replication user slaveuser_for_s1 that can be used by server1 to access the MySQL database on server4:

server4:

mysql -u root -p

GRANT REPLICATION SLAVE ON *.* TO 'slaveuser_for_s1'@'%' IDENTIFIED BY 'slave_user_for_server1_password';
FLUSH PRIVILEGES;
quit;
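
To double-check that a replication account has been created correctly, you can display its privileges on the MySQL shell, for example on server4:

server4:

mysql -u root -p

SHOW GRANTS FOR 'slaveuser_for_s1'@'%';
quit;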

 

3 Some Notes

In the following I will assume that the database exampledb already exists on server1 and that it contains tables with records.

Before we start setting up the replication, we create an empty database exampledb on server2, server3, and server4:

server2/server3/server4:

mysql -u root -p

CREATE DATABASE exampledb;
quit;
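
You can quickly confirm that the empty database is now present on all three servers:

server2/server3/server4:

mysql -u root -p

SHOW DATABASES;
quit;

exampledb should show up in the list.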

Comments

From: koby at: 2009-04-20 10:06:13

Great feedback.

What do you suggest?

I need a mechanism to sync tables between servers. I have a set of servers, and users are moved between them, so I need to keep identical copies of as many as 10 tables (out of ~150) in all databases. What else can I use (other than reinventing the wheel)?

 

Thanks.

From: Sheeri K. Cabral at: 2008-11-19 10:46:12

I am in full agreement with Keith that master/master replication and circular replication spell trouble. Which is the "authoritative" copy?

 Also, if you have a ring where replication is as follows:

 srv1 -> srv2 -> srv3 -> srv4 -> srv1

(server2 is a slave of server1, etc)

Then if one link in the chain breaks, no new data goes down the line.  For example, if srv3 is down, then changes happen like this:

srv3 -> srv4 -> srv1 -> srv2 (OK, because srv3 already has the data)

srv4 -> srv1 -> srv2 (Bad, srv3 doesn't get the change from srv4)

srv1 -> srv2 (Bad, srv3 and srv4 don't get the change from srv1, even though replication on srv4 is working just fine)

srv2 (Bad, no other machines get changes from srv2, even though replication is only broken on one machine)

From: Joe Dunnigan at: 2009-01-02 21:58:38

The following article addresses the issue of the replication chain being broken if a node goes down:

http://www.onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication.html?page=5

To answer your first question, in this and any of the MySQL master-master tutorials on howtoforge, the first server is the authoritative server. None of the other servers will ever actually be contacted by the application, either to write or read data, unless the primary master goes down. There is little discussion so far on load balancing this configuration.

From: Keith Murphy at: 2008-11-18 21:26:29

Falko,

I have seen your tutorials before. You always do a good job. However, I would say that setting up a ring of servers in this manner and writing to them all is a VERY bad idea. Replication simply isn't robust enough for it. Simple MM replication is bad enough, and it isn't a good idea to be writing to both servers in that setup. Adding more servers just adds more points of brokenness.

I have worked with customers who have this type of setup. Everything's great until something goes wrong, and then you don't know what was written, where it was written, or whether it should have been written.

If you want multiple servers to read from, a simple master with multiple slaves will suffice.

I repeat... it is a VERY BAD IDEA to write to multiple servers and expect them to stay in sync!!

 thanks,

Keith Murphy

 

From: at: 2008-11-21 14:13:51

The way it is done here is indeed a bad idea, but I'm sure it's possible with a simple script for a slave server to detect a failure of its master and then start replicating from another master.

Also, I'm sure it would be possible to do something like a "RAID 5" replication circle. From what I read on Linux-HA, Heartbeat v2 supports multiple nodes, so there could be a failover node that would take over if any node went down... Basically, that failover node would sync with the others but would not become a master until a live node went down. Then it would take over the IP of the failed node...

@ Keith: A simple master with slaves is not a super good idea either; the single master is also a point of failure. A two-way master-master setup with slaves would be a better idea.

From: robw at: 2008-11-27 01:51:26

"I'm sure it's possible" is a world of difference from "it works solidly, lots of people do this every day".  Please listen to Sheeri and Keith, they know the truth, they bear the scars.

The situation has certainly improved in MySQL 5 with the auto_increment_* settings being a cool hack to get it working at all, but it's still too fragile unless in a tightly controlled environment, where the DBA can sit next to the application developer and keep them on the right path, help them to understand the limitations.

As a DBA, and that's who this how-to was written for, can you guarantee, for example, that the application will never try to write the same unique key to two different nodes?  How does the circle deal with a node failure?  How does the application deal with a node failure?

The answers in practice aren't as simple as one might think.  They're all quite nasty, actually.

We ultimately have to trust that the coders understand the design weaknesses and have a strict enough coding practice to code around them every time.  Much easier said than done, especially when even DBAs struggle to understand the problem.

 

-- Rob

From: Andi at: 2008-11-20 18:24:12

WHY do you repeat all the steps for every server instead of giving 1 abstract howto for every server?

It would be better if you gave an overview of the abstract concepts behind the MySQL server.

Anyway, thank you for the nice howto.

From: at: 2009-04-17 21:36:45

"...there is absolutely nothing to gain from writing to more than one of the servers at a time." Actually I think there is (and I have been using it for more than a year now,) if your application is designed for HA then you probably use some kind of MySQL proxy system or other app fail-over mechanism. Then you will be able to write to any cluster node. For example: If a node of the circular multi-master cluster fails then your app will continue to run -and write data! Human intervention -best in my opinion- or very complex automated stored procedure methods exist to recover -or at least reconnect a smaller circle of the full cluster. Meanwhile you app is still running on a subset of the cluster :). Caveat: You probably should use other multiple backup systems of your MySQL data. And please, your apps should be able to handle failure correctly. This also means that your app should not rely on the MySQL cluster for mission critical roles, or -shudder- life or property critical roles!

From: Anonymous at: 2008-11-19 00:51:18

It is best not to tell people how to do this or give them ideas about it.  The auto-increment settings solve nothing.  They just make it possible for you to fail to notice how badly messed up your data is getting.  In addition, there is absolutely nothing to gain from writing to more than one of the servers at a time.  And there are many edge cases when things can go horribly wrong.

I advise readers to get a copy of High Performance MySQL, Second Edition and read the replication chapter twice.

From: Anonymous at: 2008-11-25 12:07:00

I'm sorry but mysql replication is quite good these days.  I have been using master <-> master replication for quite some time without issue.

From: Andy Haveland-Robinson at: 2010-08-12 18:17:55

I've been using this kind of replication for a few years, experienced and recovered from a lot of these disasters. Here is some advice based on my experience:

  1. Never ignore replication errors, unless you are really sure. They can be skipped on the slave with this command:
    mysql -u root -ppassword -BEse 'set global sql_slave_skip_counter=1;start slave sql_thread'

  2. Another useful command:
    watch -n10 'mysql -u root -ppassword -BEse "show slave status"'

  3. I tried mysql-proxy, but if there are a lot of users writing data, there are risks that it selects a server that hasn't yet reached a required update, and replication could fail with foreign key errors, Not Good! I never could get mysql-proxy's rw-splitting script to work properly using utf8, and it didn't keep track of current databases properly. After years of being in development I still found it unusable for even amateur use.
    It seems to be a holy grail, to be able to abstract the database layer from the application, but I think it is just a dream. The proxy has to know more than just about connections, and should know more than just about read or write.
    It should make decisions based on sessions and users and assign a server to a user, and stick to it for the session, or until a failover, in which case it should be reassigned to another server after a reasonable delay to allow propagation.
    Mysql-proxy can't tie a user to a server because it doesn't know about users, only connections and some sql parsing, so I found it better to move this decision making to the web application layer, and write my own affinity/failover load balancer to select a free server from a rack of servers over ssh tunnels. In this way, users don't randomly get logged out or see inconsistent pages and think that they have to update again what they thought they just updated!
    If affinity is user-orientated, then on a social network site for example, other users' activities become invisible, but they can each continue to use the server they have been assigned without any problems. When replication resumes hopefully within a few seconds, everyone will receive the pending updates.

  4. Triggers can be difficult, so thoroughly debug on a test platform before going live and hit it hard with something like siege.

  5. If you use a cluster, always add a spare slave to take point-in-time backups. If any of the servers' databases acquires an inconsistency, it is likely that this slave will be the most complete.

  6. Restoring a backup and resynchronising can be inconvenient, especially if there are many gigabytes of data, and the simplest approach is to go offline for maintenance, and reload from latest backup. Allowing reloading to replicate is the simplest way, but wasteful with a big database.
    In this case I would recommend failing over to a good master or masters (after reconfiguring the ring), stopping the slaves, backing up a good slave, gzip, send to each server, ungzip and use SET SQL_LOG_BIN=0; before reloading. Once loaded, all the slaves will be at the same state as the nearest slave to the working master(s). This can also be scripted. They can be reset to reclaim log space and re-added into the cluster. Paying attention to the state of SQL_LOG_BIN is paramount when doing this!!!

  7. It is possible to use this setup successfully. Facebook proves this with thousands of MySQL servers.
    I run a webserver using circular replication - two of the db servers are at home connected over permanent SSH tunnels to save on colocation costs as the bandwidth requirements are minimal. I wrote my own tools to reshape a cluster on-the-fly, and can add and remove servers within seconds. Theoretically, the php application could reshape the cluster on demand, but at present it just selects another working database and updates a file of port addresses in /dev/shm to notify other php sessions that a server became offline.
    Next enhancement would be to use memcached instead, but I'm still working on proof of concept.

Managing a cluster is a huge responsibility: one slip and it could be Game Over!

Andy.

From: at: 2009-01-08 23:00:55

Should a server go down, how does the replication continue? As this goes round in a circle, with each server getting updates from the previous one, there will be a break in the sequence and replication will not continue. How do you overcome this?