
1st September 2010, 10:41
Senior Member
Join Date: Sep 2009
Posts: 292
Thanks: 1
Thanked 4 Times in 3 Posts
I've lost the MySQL master-master replication of ISPConfig
Hi all,
I have two mail servers based on ISPConfig 3.0.1.6.
Basically they are replicated servers, with master-master replication of MySQL and /var/vmail kept in sync with GlusterFS. This worked perfectly for almost a year. Whenever one of the servers went down, for whatever reason, replication kept working just fine.
A few days ago, because of a hardware fault, one of the servers (SRV1) went down. It's up again now, but I've lost the MySQL replication.
I set up the MySQL replication following this how-to: http://www.howtoforge.com/mysql-5-ma...ation-fedora-8
Checking the MySQL slave status on both servers, I get this:
Code:
SRV1: Slave_IO_Running  Slave_SQL_Running
      NO                NO
SRV2: Slave_IO_Running  Slave_SQL_Running
      NO                YES
What is going on? Any suggestions?
Thanks
Michele

1st September 2010, 10:53
Senior Member
Join Date: Sep 2008
Location: The Netherlands
Posts: 911
Thanks: 12
Thanked 95 Times in 92 Posts
Is there anything in the Error field of the slave that stopped working?

1st September 2010, 11:20
Where can I check it?
This is the output of SHOW SLAVE STATUS:
Code:
| Slave_IO_State | Master_Host | Master_User | Master_Port | Connect_Retry | Master_Log_File | Read_Master_Log_Pos | Relay_Log_File | Relay_Log_Pos | Relay_Master_Log_File | Slave_IO_Running | Slave_SQL_Running | Replicate_Do_DB | Replicate_Ignore_DB | Replicate_Do_Table | Replicate_Ignore_Table | Replicate_Wild_Do_Table | Replicate_Wild_Ignore_Table | Last_Errno | Last_Error | Skip_Counter | Exec_Master_Log_Pos | Relay_Log_Space | Until_Condition | Until_Log_File | Until_Log_Pos | Master_SSL_Allowed | Master_SSL_CA_File | Master_SSL_CA_Path | Master_SSL_Cert | Master_SSL_Cipher | Master_SSL_Key | Seconds_Behind_Master |
| | xxx.xxx.xxx.xxx | slave2_user | 3306 | 60 | mysql-bin.000359 | 17135291 | slave-relay.001352 | 17077099 | mysql-bin.000359 | No | No | | | dbispconfig.mail_user,dbispconfig.cron,dbispconfig.spamfilter_users,dbispconfig.mail_domain,dbispconfig.test,dbispconfig.mail_content_filter,dbispconfig.mail_transport,dbispconfig.client_template,dbispconfig.mail_forwarding,dbispconfig.firewall,dbispconfig.spamfilter_wblist,dbispconfig.client,dbispconfig.spamfilter_policy,dbispconfig.mail_user_filter,dbispconfig.dns_rr,dbispconfig.mail_access,dbispconfig.dns_soa,dbispconfig.mail_traffic,dbispconfig.dns_template,dbispconfig.mail_mailman_domain,dbispconfig.mail_get,dbispconfig.mail_greylist | | | | 0 | | 0 | 17135291 | 0 | None | | 0 | No | | | | | | NULL |
I had a look at the log as well, but I can't find anything...
Thanks
Michele
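For reference, the error details asked about are the Last_Errno and Last_Error columns of the SHOW SLAVE STATUS output. In the mysql client, ending the statement with \G instead of ; prints one column per line, which is much easier to read. If you only have the wide pipe-delimited form, a minimal Python sketch like the one below can pair each column name with its value. The function name parse_status_row and the shortened sample lines are assumptions made for illustration; they are not part of any MySQL tool.

```python
def parse_status_row(header_line, value_line):
    """Pair column names with values from one pipe-delimited result row."""
    def cells(line):
        # Drop the outer pipes, split on the inner ones, trim the padding.
        return [cell.strip() for cell in line.strip()[1:-1].split("|")]

    names, values = cells(header_line), cells(value_line)
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


# Shortened, hypothetical sample: only a few of the real columns are kept.
HEADER = "| Slave_IO_Running | Slave_SQL_Running | Last_Errno | Last_Error | Seconds_Behind_Master |"
ROW = "| No | No | 0 | | NULL |"

status = parse_status_row(HEADER, ROW)
for name in ("Slave_IO_Running", "Slave_SQL_Running", "Last_Errno", "Last_Error"):
    print(f"{name}: {status[name]!r}")
```

A Last_Errno of 0 together with an empty Last_Error means the slave recorded no replication error, so the reason it stopped has to be looked for elsewhere, for example in the MySQL error log.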

1st September 2010, 12:26
Then just run START SLAVE on the slave that is not running (I couldn't see any errors).

1st September 2010, 13:43
Hi Mark.
I tried to start the slave, and this is the result:
Code:
mysql> START SLAVE;
ERROR 1201 (HY000): Could not initialize master info structure; more error messages can be found in the MySQL error log
I'm trying to understand which log I need to check, but I'm a bit lost.
In /etc/mysql/my.cnf, I can see this line:
Code:
log-bin = /var/log/mysql/mysql-bin.log
but in /var/log/mysql/ I only have these files:
Code:
mysql-bin.000531 mysql-bin.000533 mysql-bin.000535 mysql-bin.000537 mysql-bin.000539 mysql-bin.000541 mysql-bin.000543 mysql-bin.index
mysql-bin.000532 mysql-bin.000534 mysql-bin.000536 mysql-bin.000538 mysql-bin.000540 mysql-bin.000542 mysql-bin.000544
Suggestions?
Thanks
Michele
PS: I was thinking about a solution like the one suggested in this website: http://blog.bit-matrix.com/2008/11/1...nfo-structure/
Do you know if I can do that even though the two databases are no longer identical? (In the last few days I've changed some values in one of them.)
Last edited by voltron81; 1st September 2010 at 14:01.

1st September 2010, 16:23
OK, I have some news.
I followed this guide, http://blogama.org/node/49, and was able to start the slave.
However, I got exactly the same error the how-to mentions for LOAD DATA FROM MASTER;
It explains how to solve it, but I don't know the commands...
Now, if I run SHOW SLAVE STATUS on both servers, I get:
Code:
SRV1: Slave_IO_Running  Slave_SQL_Running
      NO                YES
SRV2: Slave_IO_Running  Slave_SQL_Running
      NO                YES
But still no replication...

2nd September 2010, 09:04
Ehm, nice to see you found a site with a possible solution, but that just doesn't seem right ..
Replication works by sending every query that is executed on the master to the slave and executing it there as well. All those queries are saved in the binlog files; the slave reads the master's binlog and saves a copy on its own machine to execute it.
When a binlog reaches a certain size, the server closes the file and starts a new one. Depending on your configuration, it will start deleting old binlogs once it has (let's say) more than four files.
Let's say your replication runs fine and your master is writing its incoming queries to mysql-bin.000001. When that file is full it starts writing to mysql-bin.000002, and so on. When it reaches mysql-bin.000005, it deletes mysql-bin.000001.
So you have:
mysql-bin.000002
mysql-bin.000003
mysql-bin.000004
mysql-bin.000005
If your replication stopped (or crashed) while the master was still halfway through mysql-bin.000001, and you start it again when the master is writing to mysql-bin.000005, you'll never be able to get a consistent replica: you're missing half the queries in mysql-bin.000001, and that file is no longer on the server.
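The purge scenario above can be sketched in a few lines of Python. This is only a toy model: it assumes the server keeps the newest `keep` files, as in the example, whereas a real MySQL server typically expires binlogs by age (the expire_logs_days setting) or when someone runs PURGE BINARY LOGS. The function names are made up for illustration.

```python
from typing import List


def binlog_name(number: int) -> str:
    """Format a binlog sequence number the way MySQL names the files."""
    return f"mysql-bin.{number:06d}"


def surviving_binlogs(newest: int, keep: int) -> List[str]:
    """Files still on the master if only the newest `keep` files are kept."""
    oldest = max(1, newest - keep + 1)
    return [binlog_name(n) for n in range(oldest, newest + 1)]


def missing_for_slave(slave_at: int, newest: int, keep: int) -> List[str]:
    """Purged files the slave would still need, starting at the file it stopped in."""
    oldest = max(1, newest - keep + 1)
    return [binlog_name(n) for n in range(slave_at, oldest)]


# The example from the post: master now writes file 5, four files are kept,
# and the slave stopped halfway through file 1.
print(surviving_binlogs(newest=5, keep=4))
print(missing_for_slave(slave_at=1, newest=5, keep=4))
```

If missing_for_slave returns an empty list, the slave can simply pick up where it left off; any non-empty result means part of the history is gone and a fresh dump is needed.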
OK, long story about replication ;-)
Short story: you need to stop all slaves, create a complete dump of the server on which the slave is running, load it on the master, and correct the master.info on the broken slave. Then you'll be able to start the replication both ways again.
good luck!
Edit: so check the master.info on the non-working slave and look at the binlog it was last reading when it stopped working (line 2 of the file). If that file does NOT exist on the working slave, then you need to create a new dump.
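That check can also be scripted. Below is a minimal sketch, assuming you pass in the text of master.info from the broken slave and the list of binlog file names found on the other server. The function names and the sample values are hypothetical; only the "line 2" position comes from the post. The file names are borrowed from earlier in the thread purely as illustration, since the thread does not say which server that listing came from.

```python
from typing import Iterable


def last_master_binlog(master_info_text: str) -> str:
    """Return line 2 of master.info: the master binlog the slave was last reading."""
    lines = master_info_text.splitlines()
    if len(lines) < 2:
        raise ValueError("master.info has fewer than two lines")
    return lines[1].strip()


def needs_new_dump(master_info_text: str, binlogs_on_master: Iterable[str]) -> bool:
    """True if the binlog the slave stopped in is no longer on the other server."""
    return last_master_binlog(master_info_text) not in set(binlogs_on_master)


# Hypothetical sample content; only line 2 matters for this check.
MASTER_INFO = "15\nmysql-bin.000359\n17135291\n"

# Hypothetical directory listing of the other server's binlogs.
BINLOGS = [f"mysql-bin.{n:06d}" for n in range(531, 545)]

print(last_master_binlog(MASTER_INFO))
print(needs_new_dump(MASTER_INFO, BINLOGS))
```

In real use you would read master.info from the MySQL data directory of the broken slave (its location depends on your datadir setting) and list the binlog directory of the other server instead of using the sample values.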
Last edited by Mark_NL; 2nd September 2010 at 09:08.
All times are GMT +2.