Awesome RavenDB feature of the Day, Encryption
Another upcoming feature for the next RavenDB release is full encryption support. We got several requests for this feature from customers that are interested in using RavenDB to store highly critical data. Think financial records, health information, etc.
For those sorts of applications, there are often regulatory concerns about the data at rest. I’ll state upfront that I think that a lot of those regulations make absolutely no sense from a practical standpoint, but…
The end result is that we can never put any plaintext data on disk, which results in some interesting problems for our use case. To start with, it is fairly easy to go about creating encryption for documents. They are independent from one another and are usually read / written in a single chunk. In fact, there have been several articles published on exactly how to do that. The problem is with indexes, which are not read as a whole; in fact, we have to support random reads through them.
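To make the random read problem concrete, here is a minimal sketch of a seekable stream cipher, where any byte range can be decrypted without touching the bytes before it. This is not RavenDB's implementation. The helper names (`keystream`, `xor_at`) are hypothetical, and because the Python standard library has no AES, the keystream is derived with HMAC-SHA256 in a counter construction as a stand-in for something like AES in counter mode.

```python
import hashlib
import hmac
import os

BLOCK = 32  # bytes of keystream produced by one HMAC-SHA256 call


def keystream(key: bytes, nonce: bytes, offset: int, length: int) -> bytes:
    """Return `length` keystream bytes starting at absolute byte `offset`.

    Block number `n` of the keystream is HMAC(key, nonce || n), so any
    block can be computed on its own, without generating earlier blocks.
    """
    first = offset // BLOCK
    last = (offset + length - 1) // BLOCK
    out = bytearray()
    for counter in range(first, last + 1):
        msg = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, msg, hashlib.sha256).digest()
    start = offset - first * BLOCK
    return bytes(out[start:start + length])


def xor_at(key: bytes, nonce: bytes, offset: int, data: bytes) -> bytes:
    """Encrypt or decrypt `data` as if it sits at `offset` in the file.

    XOR with the keystream is its own inverse, so the same function
    does both directions.
    """
    ks = keystream(key, nonce, offset, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


# Illustration only: a real system also needs a unique nonce per file
# and an integrity check, neither of which is shown here.
key = os.urandom(16)
nonce = os.urandom(16)
index_file = b"term:apple -> doc/1; term:banana -> doc/7; term:cherry -> doc/9"
on_disk = xor_at(key, nonce, 0, index_file)

# Random read: decrypt only bytes 20..40 of the encrypted file.
fragment = xor_at(key, nonce, 20, on_disk[20:40])
```

A document can be encrypted and decrypted in one piece, so almost any mode works there. An index needs something with this seekable property, otherwise every lookup would mean decrypting the file from the start.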
In the currently released version, you could encrypt documents using a custom bundle, but you can’t get real encryption across the board. Things like in-flight documents, partial map/reduce data and the indexes themselves will not be encrypted and will be saved in plain text, even with a custom bundle.
In the next version of RavenDB (this feature will be available for the Enterprise version), we have made sure that all of that just works. Everything is encrypted, and there is no plain text data on disk for any reason. RavenDB will transparently encrypt/decrypt the data for you when it is actually sent to disk.
By default, we use AES-128 to encrypt the data. You can change that if you want, but there is a not insignificant performance hit if you jump to AES-256, and AES-128 is just as secure, barring a quantum computer.
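The quantum computer caveat comes down to simple arithmetic. Classical brute force has to try up to 2^k keys for a k-bit key, while Grover's algorithm, in its idealized form, needs on the order of 2^(k/2) evaluations. The sketch below works that out. The guess rate is an assumption picked purely for illustration, and applying a classical guess rate to quantum evaluations is a deliberate simplification.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
GUESSES_PER_SECOND = 10**12  # assumed attacker speed, purely illustrative


def classical_work_bits(key_bits: int) -> int:
    """Exhaustive search tries up to 2**key_bits keys."""
    return key_bits


def grover_work_bits(key_bits: int) -> int:
    """Idealized Grover search needs about 2**(key_bits / 2) evaluations."""
    return key_bits // 2


def years_to_search(work_bits: int) -> float:
    """Years needed to perform 2**work_bits guesses at the assumed rate."""
    return 2**work_bits / GUESSES_PER_SECOND / SECONDS_PER_YEAR


for bits in (128, 256):
    classical = years_to_search(classical_work_bits(bits))
    quantum = years_to_search(grover_work_bits(bits))
    print(f"AES-{bits}: classical ~{classical:.2e} years, "
          f"idealized Grover ~{quantum:.2e} years")
```

Against a classical attacker both key sizes are far out of reach, which is why the larger key buys nothing in practice. Only under the idealized quantum model does the 128-bit key drop to a 2^64 search.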
The funny part (or not so funny part) is that actually encrypting the data was relatively straightforward. We had to spend a lot more time & effort on the management aspect of this feature.
For example, encryption requires an encryption key, so how do you manage that?
In RavenDB, we have two types of configuration: server wide, which usually lives in the App.config file, and database specific, which lives in the System database. For the App.config file, we provide support for encrypting the file using DPAPI, using the standard .NET config file encryption system. For database specific values, we provide our own support for encrypting the values using DPAPI.
So, the end result is:
- Your documents and indexes are encrypted when they are on disk using strong encryption.
- You can use a server wide or database specific key for the encryption (for that matter, you turn on/off encryption at the database level).
- Your encryption key is guarded using DPAPI.
- Obviously, you should back up the encryption key, because we have no way of recovering your data without it.
- The data is safely encrypted on disk, and the OS guarantees that no one can access the encryption key.
And, finally: You get to tick off the “no plaintext data at rest” checkbox and move on to do actual feature development.
Comments
Does this kill indexing performance? I would assume RavenDB needs to decrypt each document in order to index it, then encrypt the index information correct?
That process must be slow?
This is important: if you use this bundle, back up your keys. They will not be backed up together with the rest of the database. Back them up, or your database backup is completely worthless.
Phillip: in most cases, encryption is faster than I/O. I didn't notice a major difference when working on the encryption bundle*, and I did quite a bit of testing with and without it - generally ran a 15-minute test suite on it, and it took roughly the same time, although I didn't do any real performance testing.
Oh, I didn't know this blog uses markdown. Imagine that bullet point was just a star.
@configurator - that's awesome!
Sounds great.
But if I understand this correctly: this is a security feature in order to protect the database files, yes?
Does this mean I can still access the data via the studio? Or does it need the key to show the data?
Ed, there is a major difference between authentication / authorization, which decide who can reach the database and what data they can read, and this feature. This is meant to ensure that you can comply with various regulations that require that you never have clear text data at rest. That is, if someone steals the hard drive, they don't get to do anything with it.
Ayende, I certainly see applications for this kind of security.
Disclaimer: Everything I know about RavenDB is from this blog - I admit I haven't actually used the product... In that light this may seem silly, but have you looked at using something like the EFS encryption built into NTFS for this purpose? I guess this would work well for the server product, where you could run the service under a dedicated service account, but it would be unworkable (or at least difficult) for anything embedded. I also don't know about guaranteed writes to the file system under EFS and whether or not they're affected, and if so, whether that would upset RavenDB or make it less reliable.
Ian, you can't always use an encrypted file system; some SANs do not support it, for example. And there are often regulations that require something beyond that.
In particular, just using encryption at the file system level doesn't protect you from having the clear text in the backup.
There are actually a lot of use cases where this kind of encryption totally makes sense. Think about your country's intelligence agencies as an example.