nosync – turn off sync and O_DIRECT with LD_PRELOAD

nosync turns off a stack of amazing safety measures that ensure that you never lose data, even if your machine crashes or loses power.

Sometimes you need speed and not safety. Sometimes, safety concerns can cause what you are doing to fail, because it goes too slowly. For days like that, this software will throw safety out of the window. Run it, and pray that you do not lose power or crash. Disclaimer: this software will eat your data. Beyond that there are no guarantees.

Download:

d176cc82232fb3b26a285f0a131de4cb nosync-0.1.tar.gz

Taking nosync for a spin

This is normal behaviour – sync sync()’s:

$ strace sync 2>&1 | grep sync
execve("/bin/sync", ["sync"], [/* 40 vars */]) = 0
sync()                                  = 0

This is what happens with nosync – sync doesn’t sync():

$ nosync strace sync 2>&1 | grep sync
execve("/bin/sync", ["sync"], [/* 39 vars */]) = 0
open("/usr/local/bin/nosync.so", O_RDONLY) = 3
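
The nosync command itself is not shown on this page. As a rough sketch only, assuming it does nothing more than put nosync.so into LD_PRELOAD and then exec the command it was given, a launcher could look like the following. The function names preload_value and run_with_nosync are made up for illustration; the library path is the one visible in the strace output.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* make sure setenv() is declared */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Path as seen in the strace output. */
#define NOSYNC_LIB "/usr/local/bin/nosync.so"

/* Hypothetical helper: build the new LD_PRELOAD value, keeping any
 * library that is already listed there.
 * Returns 0 on success, -1 if buf is too small. */
int preload_value(const char *existing, const char *lib,
                  char *buf, size_t size)
{
    int n;

    if (existing != NULL && existing[0] != '\0')
        n = snprintf(buf, size, "%s:%s", lib, existing);
    else
        n = snprintf(buf, size, "%s", lib);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical launcher: set LD_PRELOAD, then replace this process with
 * the requested command. It only returns (with -1) if something failed. */
int run_with_nosync(char *const argv[])
{
    char value[4096];

    if (preload_value(getenv("LD_PRELOAD"), NOSYNC_LIB,
                      value, sizeof value) != 0)
        return -1;
    if (setenv("LD_PRELOAD", value, 1) != 0)
        return -1;

    execvp(argv[0], argv);     /* does not return on success */
    return -1;
}
```

A main() would simply call run_with_nosync(argv + 1). The point to notice is that LD_PRELOAD is an environment variable: it only reaches the command started here and whatever that command starts in turn.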

Wehn is speed betetr tahn sfeaty?

When you are restoring a huge MySQL database, you will find that it can take not minutes but hours, or even days. A lot of that time is spent making sure that at every moment of every step, absolutely everything is in sync. But you don’t care. If it’s not all restored, the job is not done. It only has to be in sync at the end. On days like this, you don’t need safety in the event of unexpected power loss: you need raw SPEED. Of course, you are already doing a restore on another computer with all the safety measures in place, in case this one doesn’t work out. However, if this one finishes first, then it’s the one you’re going to use, because you need the system up!

To get some decent speed, you need to disable all the safety features of MySQL that insist that data must really, properly get onto the disk, and that it gets there in the correct order.

Database engines strive towards ACID compliance, but this is not an entirely appropriate goal for days when you need to restore from backup and nothing is happening until that restore from backup completes:

  1. Atomicity: no, we don’t care if these transactions get mixed up. It’s a dump, for crying out loud – there are no inter-dependencies. As far as the sysadmin is concerned the whole backup restore is one transaction, but the engine likes to think that transactions will be rolled back, so keeping them in suspense for all those GBytes is not going to work.
  2. Consistency: we can wait for this – as long as it is consistent at the end, we are happy. Wild oscillations in the practical consistency won’t bother us at all as long as we get the data into the system.
  3. Isolation: the system is off-line while we wait for this restore. There is nothing to interact with!
  4. Durability: we want it to be durable from the moment we switch over. Before that we are prepared to start again.

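You can change the way InnoDB syncs with this configuration setting (O_DSYNC opens the log files with O_SYNC and still calls fsync() on the data files, so the syncing changes shape rather than going away):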

innodb_flush_method = O_DSYNC

But we need it to be simpler, better and faster than that (or just faster).

The solution of the day is an LD_PRELOAD module that nerfs a few system calls that force writing to disk. The system calls that get all their sync-ness and O_DIRECT-ness nerfed away are:

  • sync
  • fsync
  • fdatasync
  • sync_file_range
  • open
  • open64
  • openat
  • openat64
  • fcntl(...,F_SETFL)

The sync-style calls in this list simply skip the real system call. For the open family and fcntl(...,F_SETFL), the overridden function silently drops O_DIRECT and O_SYNC and says that it did nothing of the sort.
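
To make the idea concrete, here is a minimal sketch of the two ingredients: sync calls that do nothing, and a mask that removes the flags. This is not the source of nosync, just an illustration under assumptions. The helper name strip_sync_flags and the file name in the build comment are invented, and returning 0 (success) from the stubbed calls is a guess at the sensible behaviour.

```c
/* Build (assumed file name):
 *   gcc -shared -fPIC -o nosync_sketch.so nosync_sketch.c
 * Try it:
 *   LD_PRELOAD=./nosync_sketch.so strace sync 2>&1 | grep sync
 */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc only exposes O_DIRECT with this */
#endif
#include <fcntl.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* nothing to strip where it does not exist */
#endif

/* Hypothetical helper: clear the flags that force data straight to disk. */
int strip_sync_flags(int flags)
{
    return flags & ~(O_SYNC | O_DIRECT);
}

/* Same names and signatures as libc. When this file is preloaded, the
 * dynamic linker resolves calls to these definitions first, so the real
 * system call is never made. Each stub just claims success. */
void sync(void)
{
}

int fsync(int fd)
{
    (void)fd;                  /* deliberately ignored */
    return 0;
}

int fdatasync(int fd)
{
    (void)fd;                  /* deliberately ignored */
    return 0;
}
```

Wrappers for the open family would pass their flags through strip_sync_flags and then hand over to the real function, which a preload library usually looks up with dlsym(RTLD_NEXT, "open"). That part is left out here to keep the sketch short.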

Gotcha

If you do the following, it might not work when upstart is in charge, which is the default on current Ubuntu distributions:

/etc/init.d/mysql stop
nosync /etc/init.d/mysql start

This fails because init starts the process, not your shell, so the LD_PRELOAD setting never reaches mysqld. Rather, start mysqld yourself like this, and happiness will follow:

/etc/init.d/mysql stop
nosync mysqld &

Actually, wait, happiness does not follow. This error follows in /var/log/syslog:

kernel: [   86.460368] type=1400 audit(1323365994.273:27): 
apparmor="DENIED" operation="open" parent=2607 
profile="/usr/sbin/mysqld" name="/usr/local/bin/nosync.so" 
pid=2656 comm="mysqld" requested_mask="r" denied_mask="r" 
fsuid=0 ouid=0

So apparmor noticed that mysqld was loading something unusual, and stomped it. That’s not altogether bad. To make it happy, you have to edit /etc/apparmor.d/usr.sbin.mysqld and allow it to load the preload file:

  /usr/sbin/mysqld mr,
  /usr/local/bin/nosync.so mr,
  /usr/share/mysql/** r,

Alternatives

While looking around to see whether this page has become famous (it hasn’t), I found these similar efforts:
