Last_IO_Errno: 1595 “Relay log write failure: could not queue event from master”

The disk filled up on a MySQL master database because a few 80 GB queries were being written to the MySQL binary logs (somehow it couldn’t handle them). The fix on the master was to add more disk space. The fix on the slave was not the usual:

STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; START SLAVE;

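That made the slave spend a lot of time on the transaction, and then fail again. The skip counter only helps when a statement arriving from the master is the problem. Here the problem is on the slave’s side: the I/O thread has stopped with error 1595 because it could not write the relay log, while the SQL thread keeps grinding through the oversized transaction and feeling stupid. It says it like this: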

show slave status\G
                   Master_Host: 192.168.0.41
                   Master_User: mastersvr10
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mastersvr4-bin.044956
           Read_Master_Log_Pos: 2439692635
                Relay_Log_File: mastersvr4-relay-bin.003988
                 Relay_Log_Pos: 374519436
         Relay_Master_Log_File: mastersvr4-bin.044954
              Slave_IO_Running: No
             Slave_SQL_Running: Yes
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 374519132
               Relay_Log_Space: 53133251140
         Seconds_Behind_Master: NULL
                 Last_IO_Errno: 1595
                 Last_IO_Error: Relay log write failure: could not queue event from master
                Last_SQL_Errno: 0
                Last_SQL_Error:
       Slave_SQL_Running_State: Update_rows_log_event::ha_update_row(-1)
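
For illustration, the same reading of the status can be done mechanically. This is a minimal sketch in Python, assuming the output of show slave status\G has been captured as plain text; parse_slave_status and diagnose are hypothetical helper names, not part of any MySQL tool, and the sample is a shortened copy of the status above.

```python
def parse_slave_status(text):
    """Turn the 'Key: value' lines of a captured slave status into a dict."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            status[key.strip()] = value.strip()
    return status


def diagnose(status):
    """Report which replication thread stopped, based on the status fields."""
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    if not io_ok and status.get("Last_IO_Errno", "0") != "0":
        return "io thread stopped: errno " + status["Last_IO_Errno"]
    if not sql_ok and status.get("Last_SQL_Errno", "0") != "0":
        return "sql thread stopped: errno " + status["Last_SQL_Errno"]
    if io_ok and sql_ok:
        return "both threads running"
    return "stopped without a recorded error"


SAMPLE = """
              Slave_IO_Running: No
             Slave_SQL_Running: Yes
                 Last_IO_Errno: 1595
                 Last_IO_Error: Relay log write failure: could not queue event from master
                Last_SQL_Errno: 0
       Slave_SQL_Running_State: Update_rows_log_event::ha_update_row(-1)
"""

print(diagnose(parse_slave_status(SAMPLE)))
```

A stopped I/O thread with a non-zero Last_IO_Errno points at fetching and storing events, which SQL_SLAVE_SKIP_COUNTER does not touch.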

Stopping and starting the slave basically causes the Slave_SQL_Running_State to toggle between Update_rows_log_event::ha_update_row(-1) and Update_rows_log_event::find_row(-1), and then finally stop trying.

The fix is to reconnect to the master at the place where execution left off. The relevant parts of the status are Relay_Master_Log_File, which is the master binlog the SQL thread is working through, and Exec_Master_Log_Pos, which is how far it got in that file. Master_Log_File is only how far the I/O thread had read ahead, and here it is two files further on:

show slave status ...
               Master_Log_File: mastersvr4-bin.044956
           Read_Master_Log_Pos: 2439692635
                Relay_Log_File: mastersvr4-relay-bin.003988
                 Relay_Log_Pos: 374519436
         Relay_Master_Log_File: mastersvr4-bin.044954
           Exec_Master_Log_Pos: 374519132

And the fix is:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mastersvr4-bin.044954', MASTER_LOG_POS=374519132;
START SLAVE;
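
The statement can also be assembled from the status fields rather than copied by hand. This is a hedged sketch in Python, not something that ships with MySQL; build_change_master is a hypothetical helper. It assumes the restart point is Exec_Master_Log_Pos paired with Relay_Master_Log_File, since the MySQL manual defines the executed position relative to that file (mastersvr4-bin.044954 in the full status above). It only builds the SQL text; running it is left to the operator.

```python
import re


def build_change_master(status):
    """Build a CHANGE MASTER TO statement from the executed coordinates."""
    # Exec_Master_Log_Pos is a position inside Relay_Master_Log_File,
    # not inside Master_Log_File (which tracks the I/O thread).
    log_file = status["Relay_Master_Log_File"]
    log_pos = int(status["Exec_Master_Log_Pos"])
    # Refuse anything that does not look like a plain binlog file name.
    if not re.fullmatch(r"[\w.\-]+", log_file):
        raise ValueError("unexpected characters in log file name: %r" % log_file)
    return "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;" % (
        log_file,
        log_pos,
    )


# Values copied from the status output in this post.
status = {
    "Master_Log_File": "mastersvr4-bin.044956",
    "Relay_Master_Log_File": "mastersvr4-bin.044954",
    "Exec_Master_Log_Pos": "374519132",
}

for statement in ("STOP SLAVE;", build_change_master(status), "START SLAVE;"):
    print(statement)
```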

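Giving CHANGE MASTER TO new master log coordinates makes the slave throw away its existing relay logs and fetch everything again from that position. And just like that, it’s happy again.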

Acknowledgements to https://shahalpk.name/mysql-slave-log-corrupted-how-to-fix/ (who re-entered the password and everything, which is unnecessary in the normal case).
