How To Skip Certain Errors In MySQL Replication

MySQL replication is nice, however it can happen that it stops because of an error, and restoring a working replication can be hard - you need to set locks on the master to get a consistent MySQL dump, and during that time websites are not accessible. However there's a way to make the MySQL slave ignore certain errors using the slave-skip-errors directive.

 

1 Preliminary Note

Using slave-skip-errors you can make the replication slave ignore certain error codes (you can find the list of MySQL error codes here: Server Error Codes and Messages). However, you should use this with caution - it should be your last instrument trying to restore replication because it can make your slave inconsistent. You should first try one of these: How To Repair MySQL Replication or use the replicate-ignore-db or replicate-ignore-table statements in the slave's my.cnf file to skip replication for databases/database tables that cause replication to fail (if you don't need replication for these databases/database tables).

 

2 Using slave-skip-errors

Let's assume replication has stopped, and you are getting something like this on the slave's MySQL shell:

SHOW SLAVE STATUS \G
mysql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 1.2.3.4
                  Master_User: slave_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000024
          Read_Master_Log_Pos: 933201702
               Relay_Log_File: mysqld-relay-bin.000113
                Relay_Log_Pos: 63519994
        Relay_Master_Log_File: mysql-bin.000021
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB: mysql,information_schema,performance_schema,test
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1054
                   Last_Error: Error 'Unknown column 'tx_feuserbranch_agb' in 'field list'' on query. Default database: 'dbtest'. 
Query: 'INSERT INTO fe_users (email,first_name,last_name,address,ip,city,country,telephone,fax,company,tx_feuserbranch_brancha,
tx_feuserbranch_customernr,tx_feuserbranch_agb,username,password,name,usergroup,disable,by_invitation,tx_srfeuserregister_password,
tstap,crdate,cruser_id,fe_cruser_id,pid) VALUES ('[email protected]','John','Doe','test','55555','test','Deutschland','49111111111',
'','test','0','55','on','[email protected]','xxxxxxxxxxxxxx','John Doe','0','1','0','','1361359747','1361359747','0','0','33')'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 120460827
              Relay_Log_Space: 3500841420
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: /etc/mysql/newcerts/ca-cert.pem
           Master_SSL_CA_Path:
              Master_SSL_Cert: /etc/mysql/newcerts/client-cert.pem
            Master_SSL_Cipher:
               Master_SSL_Key: /etc/mysql/newcerts/client-key.pem
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1054
               Last_SQL_Error: Last_Error: Error 'Unknown column 'tx_feuserbranch_agb' in 'field list'' on query. Default database: 'dbtest'. 
Query: 'INSERT INTO fe_users (email,first_name,last_name,address,ip,city,country,telephone,fax,company,tx_feuserbranch_brancha,
tx_feuserbranch_customernr,tx_feuserbranch_agb,username,password,name,usergroup,disable,by_invitation,tx_srfeuserregister_password,
tstap,crdate,cruser_id,fe_cruser_id,pid) VALUES ('[email protected]','John','Doe','test','55555','test','Deutschland','49111111111',
'','test','0','55','on','[email protected]','xxxxxxxxxxxxxx','John Doe','0','1','0','','1361359747','1361359747','0','0','33')'
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)

mysql>

As you can see in the Last_Errno row the error code of the error that caused replication to stop is 1054 (Unknown column '%s' in '%s'). If you want the MySQL slave to ignore such errors, just open your my.cnf (on Debian and Ubuntu it's /etc/mysql/my.cnf)...

vi /etc/mysql/my.cnf

... and add the line slave-skip-errors = 1054:

[...]
expire_logs_days        = 10
max_binlog_size         = 100M
server-id=2
replicate-ignore-db = mysql
replicate-ignore-db = information_schema
replicate-ignore-db = performance_schema
replicate-ignore-db = test
slave-skip-errors = 1054
[...]

Restart MySQL afterwards:

/etc/init.d/mysql restart

Log into MySQL again...

mysql -u root -p

... and check the slave's status again - it should now be working again:

SHOW SLAVE STATUS \G

To make the slave ignore multiple error codes, just separate them by commas:

slave-skip-errors = 1062,1054

You can find the list of MySQL error codes here: Server Error Codes and Messages

 

Share this page:

5 Comment(s)

Add comment

Please register in our forum first to comment.

Comments

By: Giuseppe Maxia

Don 't do that!

Skipping errors blindly is the worst thing to do.

 A replication error has usually a reason, and finding that reason makes your system safer. If you skip errors, you will never get to know that something went wrong

By:

I agree, skipping errors like this is terrbily dangerous!

Rather than that, if you managed to solve your issue and you want to skip one or more sql statement restart this way:

SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE; 

By: K. Pardiac

Thanks for the advice.  This helped solved our problem in replication.  We had a corrupt table in production that generated errors on INSERT DELAYED statements.  On the replication server, the statements executed fine but replication stopped because the error code on replication did not match the one in production.  There were several hundred -- at least -- of these so incrementing the skip counter when each stoppage happened would have been time consuming.  We used SLAVE_SKIP_ERRORS=0 to keep replication from stopping on these INSERTs and could then use the replicated table (which had all to rows) to replace the production version.

By: robert

Thanks saved me a ton of work

By: Todd

Great Work, I removed the entry:

Last_SQL_Error: Error 'Duplicate entry '18' for key 'PRIMARY'' on query. Default database: 'vtiger'. Query: 'INSERT INTO vtiger_loginhistory (user_name, user_ip, logout_time, login_time, status) VALUES ('admin','107.77.68.98','0000-00-00 00:00:00','2020-06-21 17:02:10','Signed in')'

use vtiger;

delete from vtiger_loginhistory where user_name = 'admin' and login_time = '2020-06-21 17:02:10';

stop slave; start slave;

But the error continued to show up even after I stopped the slave and database (Empty Entry). I even added the single global statement stop slave; set global sql_slave_skip_counter=1; start slave;

So after adding the erroneous error to my my.cnf file in my docker mariadb DB, everything started working fine.

Thank you for all of your insight.

Thanks to you I have replicated across multiple Mariadb across multiple cloud providers (AWS, Oracle, Azure, Digital Ocean and on-premise).

Todd