Monday, 17 December 2012

Master crash, Replication broken, Master resumed, Replication still broken!!!



In search of a MySQL replication topic for my blog, today I got stuck with an unusual replication behavior. Well, that gives me a topic (never mind that the issue has since been fixed, as MySQL says).
So the scenario: we have a simple Master-->Slave replication setup for one of our clients. The master crashes, which stops replication; the master is then brought back up and running, but when we resume replication the IO thread slaps us with an error (a strange error, when nothing should have gone wrong).

121216 22:12:46 [ERROR] Slave I/O thread: error reconnecting to master 'abc@xxx.xxx.xxx.xxx:3306': Error: 'Lost connection to MySQL server at 'reading initial communication packet', system error: 145'  errno: 2013  retry-time: 60  retries: 86400
121216 22:38:57 [Note] Slave: connected to master 'bfimapp@xx.xxx.xxx.xxx:3306',replication resumed in log 'master1-bin.001073' at position 1042308987
121216 22:38:57 [ERROR] Error reading packet from server: Client requested master to start replication from impossible position ( server_errno=1236)
121216 22:38:57 [ERROR] Got fatal error 1236: 'Client requested master to start replication from impossible position' from master when reading data from binary log

So that's the error: it suggests that replication is trying to start from an impossible position. We dug into the binlog in question, only to find that the last position (end_log_pos of the last event) was 1042308684, short of the 1042308987 the slave was asking for. Since the master crashed, it also never got the chance to write a rotate event to the file, but that's another matter.
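
For the record, this is roughly how we dug into it (the datadir path here is an assumption, adjust it to your setup). mysqlbinlog prints an end_log_pos for every event, so the last value printed is the last position the master actually has:

$ mysqlbinlog /var/lib/mysql/master1-bin.001073 | grep 'end_log_pos' | tail -n 3

The same information is available from the server with SHOW BINLOG EVENTS IN 'master1-bin.001073', though on a binlog this size that is a lot of output to page through.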

The question is: what made the slave write such a position to master.info, one that doesn't exist in the binlog files on the master?
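
A quick way to see what the slave has recorded, without opening master.info by hand, is SHOW SLAVE STATUS; the fields to look at are Master_Log_File, Read_Master_Log_Pos and Last_IO_Error. Read_Master_Log_Pos is the position the IO thread believes it has copied from the master (the same value kept in master.info), and in our case it showed 1042308987, a position the master's binlog simply does not reach:

mysql> SHOW SLAVE STATUS\G

If you prefer the file itself, the master log file name and read position sit near the top of master.info in the slave's datadir (the exact layout varies a little between versions).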

In search of answers, let's follow the logical workflow of MySQL replication: a transaction (which can be a single statement when autocommit=1), once executed, is written to the binlog; the slave is then signalled, copies the events to its relay logs, and writes the new coordinates to master.info (the point up to which the IO thread has copied the binlog). Now, the situation above suggests that MySQL did not get a chance to persist the statement to the binlog file (i.e. to call "fdatasync") before sending it to the slave. This is where the "Binlog Cache" comes into the picture.

So a DML statement executed inside a transaction with binlogs enabled is first written to the binlog cache, and once the transaction commits, the binlog cache is flushed to disk (written to the binlog file). This is the point where the statement got replicated to the slave and the new position got written to master.info, yet before the cache was flushed to the binlog file the master crashed. This is not binlog corruption, and the slave had no clue that it should jump to the next binlog file; it simply returned the error.
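
The cache is easy to observe on any master with binlogs enabled; these are standard server variables and status counters, nothing specific to this incident. Binlog_cache_use counts transactions that went through the cache, and Binlog_cache_disk_use counts the ones that were too big for binlog_cache_size and spilled to a temporary file before the final flush:

mysql> SHOW VARIABLES LIKE 'binlog_cache_size';
mysql> SHOW GLOBAL STATUS LIKE 'Binlog_cache%';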

A fix for this problem is to execute CHANGE MASTER TO on the slave, pointing it at the new binlog position (the first file created after MySQL was started back up on the master), as sketched below.
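
Something along these lines; the file name master1-bin.001074 is only my assumption for the first binlog created after the restart, so check SHOW BINARY LOGS (or SHOW MASTER STATUS) on the master and use whatever it actually reports:

mysql> -- on the master
mysql> SHOW BINARY LOGS;

mysql> -- on the slave
mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_LOG_FILE='master1-bin.001074', MASTER_LOG_POS=4;
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS\G

Position 4 is simply the first usable position in any binlog file (right after the 4-byte magic header). Note that this only gets replication moving again; whether the slave is still consistent with the master is exactly the first question below.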

There are still some questions unanswered, such as:
1) Does InnoDB instance recovery roll back the failed transaction? If it does, then we have an inconsistent slave.
2) Would this have happened if the tables in question were MyISAM?
3) How will replication behave if there is no rotate event in the binary logs?
4) Could sync_binlog and sync_relay_log have saved us from this situation? (The settings are sketched after this list.)
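
For question 4, these are the knobs involved, as they would appear in the option files. Whether sync_binlog=1 alone would have closed the window between "sent to the slave" and "flushed to the binlog file" is precisely what needs testing, so take this as a sketch rather than a verdict:

# on the master
[mysqld]
sync_binlog = 1                      # fsync the binlog on every write
innodb_flush_log_at_trx_commit = 1   # commonly paired with full InnoDB durability

# on the slave
[mysqld]
sync_relay_log = 1                   # fsync the relay log after every event

All of these trade write performance for durability, so they make sense where re-pointing or rebuilding a slave costs more than the extra fsyncs.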

I will try to find answers to these questions; till then, enjoy replicating.
