The Guts n’ Glory of Database Internals: Log shipping and point in time recovery

In my previous post I talked about the journal and how it is used to recover the database state in case of errors.

In other words, we write everything that we are going to do into the journal, and the code is written to run through the journal and apply these changes to the database.

Ponder this for a moment, and consider the implications of applying the same concept elsewhere. Log shipping is basically taking the journal file from one server and asking another server to apply it as well. That is all.

Everything else is just details that aren’t really interesting… Well, they are, but first you need to grasp what is so special about log shipping. If you already have a journal and recovery from it, you are probably at least 80% of the way to getting log shipping to work.
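To make that reuse concrete, here is a minimal sketch of the idea, assuming a toy key/value store whose journal holds one JSON-encoded entry per line (the file name, entry format, and helper names are all hypothetical). The point is that crash recovery and log shipping share the exact same apply loop; only the source of the entries differs:

```python
import json

def apply_journal(db: dict, journal_lines):
    # Replay journal entries, in order, against the database state.
    for line in journal_lines:
        entry = json.loads(line)
        if entry["op"] == "put":
            db[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            db.pop(entry["key"], None)

# Crash recovery: replay our own journal into an empty state.
db = {}
with open("local.journal") as f:
    apply_journal(db, f)

# Log shipping is the very same call -- the only difference is that
# the lines arrived over the network from another server's journal.
```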

Now that you understand what log shipping is, let’s talk about its implications. Depending on the way your log is structured, you might be able to accept logs from multiple servers and apply all of them (accepting the fact that they might conflict), but in practice it is almost never done this way. Log shipping typically mandates that one server be designated the master (statically or dynamically), with the other(s) designated as secondaries (typically read-only copies).

The process by which the logs are shipped is interesting. You can do it on a time basis (every 5 – 15 minutes), every time you close a journal file (64MB – 256MB), etc. This is typically called offline log shipping, because there is a large gap between the master and the secondary. There is also online log shipping, in which every write to the journal is also a socket write to the other server, which accepts the new journal entries, writes them to its own journal and applies them immediately, resulting in a much narrower delay between the systems. Note that this has its own issues, because it is now a distributed system with all that implies (if the secondary isn’t available for an hour, what does that mean, etc.).
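A rough sketch of the online variant, under the same toy journal assumptions as above (the length-prefixed wire format and all the names here are my own invention, not any particular database’s protocol). Note how the secondary does exactly what the master did: journal first, then apply:

```python
import json
import socket

def apply_entry(db: dict, raw: bytes):
    # The same apply logic that the recovery path uses.
    entry = json.loads(raw)
    if entry["op"] == "put":
        db[entry["key"]] = entry["value"]
    elif entry["op"] == "delete":
        db.pop(entry["key"], None)

def recv_exact(conn: socket.socket, n: int):
    # Read exactly n bytes, or return None if the connection closed.
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf

def master_write(journal_file, conn: socket.socket, entry: bytes):
    # Durability first: the write lands in our own journal...
    journal_file.write(entry)
    journal_file.flush()
    # ...and is then also shipped, length-prefixed, to the secondary.
    conn.sendall(len(entry).to_bytes(4, "big") + entry)

def secondary_loop(conn: socket.socket, journal_file, db: dict):
    while True:
        header = recv_exact(conn, 4)
        if header is None:
            break  # master is gone -- the distributed-systems questions start here
        entry = recv_exact(conn, int.from_bytes(header, "big"))
        if entry is None:
            break
        journal_file.write(entry)  # write to our own journal...
        journal_file.flush()
        apply_entry(db, entry)     # ...and apply it immediately
```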

But journals also allow us to do other fun stuff. In particular, if the journal file records the timestamp of transactions (and most do), it allows us to do what is called “point in time” recovery. Starting from a particular backup, we apply all the committed transactions up to a certain point in time, bringing the database to its state at, say, 9:07 AM (one minute before someone ran an UPDATE statement without the WHERE clause).

Note that all of the code that actually implements this has already been written as part of ensuring that the database can recover from errors; all we need to add for “point in time” recovery is to stop at a particular point in time. That is a pretty neat thing, in my opinion.
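Here is how that might look, continuing the toy journal sketch from above and assuming each entry carries an ISO-formatted committed_at timestamp (again, all names are hypothetical). It is the normal recovery loop with a single extra check:

```python
import json
from datetime import datetime

def apply_entry(db: dict, entry: dict):
    if entry["op"] == "put":
        db[entry["key"]] = entry["value"]
    elif entry["op"] == "delete":
        db.pop(entry["key"], None)

def recover_to_point_in_time(db: dict, journal_lines, stop_at: datetime):
    # Identical to normal recovery, plus one check: stop once we reach
    # transactions committed at or after the target time.
    for line in journal_lines:
        entry = json.loads(line)
        if datetime.fromisoformat(entry["committed_at"]) >= stop_at:
            break
        apply_entry(db, entry)

# Restore last night's backup, then replay the journal up to 9:07 AM,
# one minute before the WHERE-less UPDATE.
db = {}  # in reality, the state restored from the backup
with open("journal.since.backup") as f:
    recover_to_point_in_time(db, f, datetime(2016, 8, 3, 9, 7))
```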

More posts in "The Guts n’ Glory of Database Internals" series:

  1. (08 Aug 2016) Early lock release
  2. (05 Aug 2016) Merging transactions
  3. (03 Aug 2016) Log shipping and point in time recovery
  4. (02 Aug 2016) What goes inside the transaction journal
  5. (18 Jul 2016) What the disk can do for you
  6. (15 Jul 2016) The curse of old age…
  7. (14 Jul 2016) Backup, restore and the environment…
  8. (11 Jul 2016) The communication protocol
  9. (08 Jul 2016) The enemy of thy database is…
  10. (07 Jul 2016) Writing to a data file
  11. (06 Jul 2016) Getting durable, faster
  12. (01 Jul 2016) Durability in the real world
  13. (30 Jun 2016) Understanding durability with hard disks
  14. (29 Jun 2016) Managing concurrency
  15. (28 Jun 2016) Managing records
  16. (16 Jun 2016) Seeing the forest for the trees
  17. (14 Jun 2016) B+Tree
  18. (09 Jun 2016) The LSM option
  19. (08 Jun 2016) Searching information and file format
  20. (07 Jun 2016) Persisting information