In today’s post I’d like to touch on Redis persistence mechanisms.
There are basically two options to choose from (or a combination of the two):
- The RDB persistence – which performs point-in-time snapshots of your dataset at specified intervals.
- The AOF (append-only file) persistence – which logs every write operation received by the server. The log can later be replayed at server startup to reconstruct the original dataset (commands are logged using the same format as the Redis protocol itself).
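To make the "same format as the Redis protocol" remark concrete, here is a minimal Python sketch of how a write command could be encoded in the Redis protocol (RESP) before being appended to the log. The helper name `encode_command` is made up for illustration. It only shows the wire format and is not how the server itself is implemented.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings (illustrative helper)."""
    # Array header: '*' followed by the number of arguments.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument is a bulk string: '$<byte length>', then the payload.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


if __name__ == "__main__":
    # prints b'*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n'
    print(encode_command("SET", "key", "value"))
```

An AOF in this plain format is essentially a sequence of such entries, one per write command.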
Each of the two options is controlled by its own group of configuration settings in the redis.conf file:
- RDB persistence:
- save <seconds> <changes> – saves the DB to disk. Redis will save the DB if both the given number of seconds has elapsed and the given number of write operations against the DB has occurred. You can stack multiple save directives one after another to handle different “seconds/changes” scenarios, or you can disable saving entirely by commenting out all the “save” lines.
- stop-writes-on-bgsave-error <yes|no> – by default Redis will stop accepting writes if RDB snapshots are enabled (at least one save point) and the latest background save failed. This will make the user aware (in a hard way) that data is not being persisted to disk properly. If the background saving process starts working again, Redis will automatically allow writes again.
- rdbcompression <yes|no> – compresses string objects using LZF when dumping .rdb databases.
- rdbchecksum <yes|no> – since version 5 of RDB a CRC64 checksum is placed at the end of the file which makes the format more resistant to corruption but there is a performance hit to pay (around 10%) when saving and loading RDB files.
- dbfilename <name> – The filename (default dump.rdb) where to dump the DB.
- dir <path> – The working directory (default value is ./) where the DB will be written. The Append Only File will also be created inside this directory.
- AOF persistence:
- appendonly <yes|no> – controls whether AOF mode is turned on. By default Redis asynchronously dumps the dataset to disk (RDB persistence), which is good enough for many applications, but an issue with the Redis process or a power outage may result in a few minutes of lost writes (depending on the configured save points). AOF provides much better durability. With the default fsync policy Redis can lose just one second of writes in a dramatic event like a server power outage, or a single write if something goes wrong with the Redis process itself while the operating system is still running correctly. AOF and RDB persistence can be enabled at the same time and they play very nicely together. If AOF is enabled, on startup Redis will load the AOF, that is, the file with the better durability guarantees.
- appendfilename <name> – The name of the append only file (default: “appendonly.aof”)
- appendfsync <mode> – the mode in which fsync should operate. The fsync() call tells the operating system to actually write data to disk instead of waiting for more data in the output buffer. Some operating systems will really flush the data to disk, others will just try to do it ASAP. Redis supports three different modes:
- no: don’t fsync, just let the OS flush the data when it wants. Faster.
- always: fsync after every write to the append only log. Slow, safest.
- everysec: fsync only once every second. A compromise (default).
- no-appendfsync-on-rewrite <yes|no> – when the AOF fsync policy is set to always or everysec, and a background saving process (a background save or AOF log background rewriting) is performing a lot of I/O against the disk, in some Linux configurations Redis may block too long on the fsync() call. In order to mitigate this problem it’s possible to use this option, which will prevent fsync() from being called in the main process while a BGSAVE or BGREWRITEAOF is in progress. In practical terms, this means that it is possible to lose up to 30 seconds of log in the worst scenario (with the default Linux settings).
- auto-aof-rewrite-percentage <percentage> and auto-aof-rewrite-min-size <size> – both control the automatic rewrite of the append only file. Redis is able to automatically rewrite the log file (implicitly calling BGREWRITEAOF) when the AOF log size grows by the specified percentage. This is how it works: Redis remembers the size of the AOF file after the latest rewrite (if no rewrite has happened since the restart, the size of the AOF at startup is used). This base size is compared to the current size. If the current size exceeds the base size by more than the specified percentage, the rewrite is triggered. You also need to specify a minimal size for the AOF file to be rewritten. This avoids rewriting the AOF file when the percentage increase is reached but the file is still pretty small. Specify a percentage of zero in order to disable the automatic AOF rewrite feature.
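The two trigger rules described above (save points and automatic AOF rewrite) can be sketched in a few lines of Python. This is only a model of the behaviour as described in this post, not the server's actual code: the function names are hypothetical, the exact boundary comparisons (>= versus >) are an assumption, and the save points in the usage example are just sample values.

```python
from typing import Iterable, Tuple


def should_snapshot(save_points: Iterable[Tuple[int, int]],
                    seconds_since_last_save: int,
                    changes_since_last_save: int) -> bool:
    """Return True if any `save <seconds> <changes>` rule is satisfied."""
    # A rule fires only when BOTH its time and its change threshold are met.
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


def should_rewrite_aof(base_size: int, current_size: int,
                       rewrite_percentage: int, min_size: int) -> bool:
    """Return True if the AOF grew enough since the last rewrite."""
    if rewrite_percentage == 0:   # a percentage of zero disables auto rewrite
        return False
    if current_size < min_size:   # file is still too small to be worth it
        return False
    base = max(base_size, 1)      # assumption: guard against an empty base file
    growth = (current_size - base) * 100 / base
    return growth >= rewrite_percentage


if __name__ == "__main__":
    points = [(900, 1), (300, 10), (60, 10000)]  # sample stacked save lines
    print(should_snapshot(points, 120, 50))   # False: no rule fully satisfied
    print(should_snapshot(points, 301, 50))   # True: the 300/10 rule fires
    mb = 1024 * 1024
    print(should_rewrite_aof(64 * mb, 130 * mb, 100, 64 * mb))  # True
```

Stacking several save points simply means the snapshot happens as soon as the first of them is satisfied.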
Advantages and disadvantages of both methods (redis.io):
- RDB advantages
- RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups. For instance you may want to archive your RDB files every hour for the latest 24 hours, and to save an RDB snapshot every day for 30 days. This allows you to easily restore different versions of the data set in case of disasters.
- RDB is very good for disaster recovery: being a single compact file, it can be transferred to remote data centers or to Amazon S3 (possibly encrypted).
- RDB maximizes Redis performance, since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent instance will never perform disk I/O or the like.
- RDB allows faster restarts with big datasets compared to AOF.
- RDB disadvantages
- RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage). You can configure different save points where an RDB is produced (for instance after at least five minutes and 100 writes against the data set, but you can have multiple save points). However, you’ll usually create an RDB snapshot every five minutes or more, so if Redis stops working without a correct shutdown for any reason, you should be prepared to lose the latest minutes of data.
- RDB needs to fork() often in order to persist to disk using a child process. fork() can be time consuming if the dataset is big, and may cause Redis to stop serving clients for some milliseconds, or even for one second if the dataset is very big and the CPU performance is not great. AOF also needs to fork(), but you can tune how often you want to rewrite your logs without any trade-off on durability.
- AOF advantages
- Using AOF, Redis is much more durable. You can choose between different fsync policies: no fsync at all, fsync every second, fsync at every query. With the default policy of fsync every second, write performance is still great (fsync is performed using a background thread and the main thread will try hard to perform writes when no fsync is in progress), and you can lose at most one second’s worth of writes.
- The AOF log is an append only log, so there are no seeks, nor corruption problems if there is a power outage. Even if the log ends with a half-written command for some reason (disk full or other reasons), the redis-check-aof tool is able to fix it easily.
- Redis is able to automatically rewrite the AOF in background when it gets too big. The rewrite is completely safe as while Redis continues appending to the old file, a completely new one is produced with the minimal set of operations needed to create the current data set, and once this second file is ready Redis switches the two and starts appending to the new one.
- AOF contains a log of all the operations one after the other in an easy to understand and parse format. You can even easily export an AOF file. For instance, even if you flushed everything by mistake using a FLUSHALL command, if no rewrite of the log was performed in the meantime you can still save your data set by stopping the server, removing the latest command from the file, and restarting Redis again.
- AOF disadvantages
- AOF files are usually bigger than the equivalent RDB files for the same dataset.
- AOF can be slower than RDB depending on the exact fsync policy. In general, with fsync set to every second performance is still very high, and with fsync disabled it should be exactly as fast as RDB even under high load. Still, RDB is able to provide more guarantees about the maximum latency even in the case of a huge write load.
- Redis AOF works by incrementally updating an existing state, like MySQL or MongoDB does, while RDB snapshotting creates everything from scratch again and again, which is conceptually more robust.
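The FLUSHALL recovery trick mentioned among the AOF advantages can be illustrated with a small Python sketch. It assumes the log is a plain sequence of protocol-encoded commands held in memory; a real AOF may look different (for example, newer Redis versions can store an RDB preamble or split the AOF into several files), so treat this as an illustration of the idea and always work on a copy with the server stopped. The function names are hypothetical.

```python
from typing import List, Tuple


def parse_commands(aof: bytes) -> List[Tuple[int, List[bytes]]]:
    """Return (start_offset, args) for each RESP-encoded command in `aof`."""
    commands = []
    pos = 0
    while pos < len(aof):
        start = pos
        if aof[pos:pos + 1] != b"*":
            raise ValueError(f"expected '*' at offset {pos}")
        line_end = aof.index(b"\r\n", pos)
        argc = int(aof[pos + 1:line_end])      # number of arguments
        pos = line_end + 2
        args = []
        for _ in range(argc):
            if aof[pos:pos + 1] != b"$":
                raise ValueError(f"expected '$' at offset {pos}")
            line_end = aof.index(b"\r\n", pos)
            length = int(aof[pos + 1:line_end])  # payload length in bytes
            pos = line_end + 2
            args.append(aof[pos:pos + length])
            pos += length + 2                    # skip payload and trailing CRLF
        commands.append((start, args))
    return commands


def drop_trailing_flushall(aof: bytes) -> bytes:
    """Cut the last command off the log if (and only if) it is FLUSHALL."""
    commands = parse_commands(aof)
    if commands:
        start, args = commands[-1]
        if args and args[0].upper() == b"FLUSHALL":
            return aof[:start]
    return aof


if __name__ == "__main__":
    log = (b"*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\n1\r\n"
           b"*1\r\n$8\r\nFLUSHALL\r\n")
    # prints b'*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\n1\r\n'
    print(drop_trailing_flushall(log))
```

The sketch only touches the log when FLUSHALL is the very last command, which matches the "no rewrite and no later writes" condition of the recovery scenario.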
The general advice from the Redis team is that you should use both persistence methods if you want a degree of data safety comparable to what PostgreSQL can provide you.
Take care!
Resources:
- Redis.io documentation – persistence (http://redis.io/topics/persistence)
- Redis persistence demystified (http://oldblog.antirez.com/post/redis-persistence-demystified.html)