The importance of a data format: Part II – The environment matters
When designing a new data format, it is important to remember what environment we’ll operate in, what the requirements are and what type of scenarios we’ll face.
With RavenDB, we are talking about the internal storage format, so it isn’t something that is available externally. That means that we don’t have to worry about interchange with anything, which frees up the design quite a bit. We want to reduce parsing costs, size on disk and managed allocations.
That leads us to a packed binary format, but not all binary formats are created equal. In particular, we need to consider whether we’ll have a streaming format or a complete format. What is the meaning of that?
A streaming format means that you read it one byte at a time to construct the full details. JSON is a streaming format, for example. That is not something that we want to do, because a streaming format requires us to build an in-memory representation before we can work with the object. And even if we wanted just one particular value from the document, we would still need to parse through the entire document to find the relevant field.
So we want a complete format. A complete format means that we don’t need to parse the document to get to a particular value. Internally, we refer to such a format as Blittable. I’m not fond of this term, and I would appreciate suggestions to replace it.
I’ll get to the details about how this is actually laid out in my next post. In this post, I want to outline the overall design for it.
We want a format that can be read in a single call (or, more commonly for us, mmapped in its entirety), and once that is done, we can start working with it without any additional processing. Traversing through this format should be a cheap operation, so this code:
foreach (var child in doc.children)
{
    Console.WriteLine(child.firstName);
}
Should only materialize the strings for the children’s first names (the values we actually accessed), and should incur no further cost for the rest of the document.
Because we assume that the full document will reside in memory (either by loading it all from disk or by mmapping it), we don’t need to worry about the cost of traversing through the document. We can simply and cheaply jump around inside the document.
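To make the idea of jumping around concrete, here is a minimal sketch. The layout is invented purely for illustration and is not the actual RavenDB format: the names CompleteFormatSketch and ReadField, the one-byte length prefix and the trailing offset table are all assumptions. It shows how a reader could go straight to one field of an in-memory buffer without parsing the fields before it.

```csharp
using System;
using System.Text;

public static class CompleteFormatSketch
{
    // Hypothetical layout (an assumption for illustration only):
    //   [field 0][field 1]...[offset of field 0][offset of field 1]...[field count]
    // Each field is a 1-byte length followed by that many UTF-8 bytes.
    // Offsets and the field count are 4-byte little-endian integers.
    public static string ReadField(byte[] buffer, int index)
    {
        // The field count lives in the last 4 bytes of the buffer.
        int count = ReadInt32(buffer, buffer.Length - 4);
        if (index < 0 || index >= count)
            throw new ArgumentOutOfRangeException(nameof(index));

        // The offset table sits immediately before the field count.
        int tableStart = buffer.Length - 4 - count * 4;
        int offset = ReadInt32(buffer, tableStart + index * 4);

        // Jump directly to the field; only this one string is materialized.
        int length = buffer[offset];
        return Encoding.UTF8.GetString(buffer, offset + 1, length);
    }

    private static int ReadInt32(byte[] buffer, int pos) =>
        buffer[pos] | (buffer[pos + 1] << 8) | (buffer[pos + 2] << 16) | (buffer[pos + 3] << 24);
}
```

In this sketch, calling ReadField(buffer, 1) touches only the last few bytes of the buffer and the bytes of the second field. The first field is never decoded, which is the property we want from a complete format.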
In other words, we don’t have to put related information close together if we have a reason to place it elsewhere.

In order to reduce memory consumption during the write phase, we need to make sure that we are mostly forward only writers. That is, the process of writing the document in the new format should not require us to hold the entire document in memory. We should also take the time to reduce the size of the document as much as possible. At the same time, just compressing the whole thing isn’t going to be good for us, because we’d lose the ability to go to any location in the document cheaply.
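As a rough sketch of what "mostly forward only" could look like, the writer below emits each field as soon as it sees it and remembers only where the field started. The class name ForwardOnlyWriterSketch, its methods and the layout (length-prefixed fields followed by an offset table and a field count) are hypothetical and chosen for illustration. This is not the actual RavenDB writer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class ForwardOnlyWriterSketch
{
    private readonly Stream _output;
    private readonly List<int> _offsets = new List<int>(); // the only state kept per field
    private int _position;

    public ForwardOnlyWriterSketch(Stream output)
    {
        _output = output;
    }

    // Writes one field immediately: a 1-byte length followed by the UTF-8 bytes.
    public void WriteField(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        if (bytes.Length > byte.MaxValue)
            throw new ArgumentException("This sketch only supports values up to 255 bytes.", nameof(value));

        _offsets.Add(_position);
        _output.WriteByte((byte)bytes.Length);
        _output.Write(bytes, 0, bytes.Length);
        _position += 1 + bytes.Length;
    }

    // Appends the offset table and the field count; nothing already written is revisited.
    public void Finish()
    {
        foreach (int offset in _offsets)
            WriteInt32(offset);
        WriteInt32(_offsets.Count);
        _output.Flush();
    }

    // 4-byte little-endian integer.
    private void WriteInt32(int value)
    {
        _output.WriteByte((byte)(value & 0xFF));
        _output.WriteByte((byte)((value >> 8) & 0xFF));
        _output.WriteByte((byte)((value >> 16) & 0xFF));
        _output.WriteByte((byte)((value >> 24) & 0xFF));
    }
}
```

Putting the offset table at the end is what lets the writer avoid seeking backwards: while writing, it holds one integer per field rather than the document itself. Because the fields are stored individually instead of the whole buffer being compressed as one block, a reader can still jump to any of them.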
Note that for the purpose of this work, we are interested in reducing work only for a single document. There are additional optimizations that we can apply across multiple documents, but they are complex to manage in a dynamic system.
So this is the setup. The previous post talked about the problem we have with JSON, and this one about what kind of solution we want to have. The next post will discuss the actual format.
More posts in "The importance of a data format" series:
- (25 Jan 2016) Part VII–Final benchmarks
- (15 Jan 2016) Part VI – When two orders of magnitude aren't enough
- (13 Jan 2016) Part V – The end result
- (12 Jan 2016) Part IV – Benchmarking the solution
- (11 Jan 2016) Part III – The solution
- (08 Jan 2016) Part II–The environment matters
- (07 Jan 2016) Part I – Current state problems
 

Comments
This looks to become like http://google.github.io/flatbuffers/md__benchmarks.html and https://github.com/sandstorm-io/capnproto
cheers, </wqw>
wqw, Yes, that was the original impetus. I wrote about it a while back. We wrote our own implementation for this because we need to have deep control over everything that is going on there.
Well, you need to interchange with future and past versions :)
Yep, this was also one of the incentives for developing LMDB in OpenLDAP. I wanted an LDAP entry format that we can simply mmap and use directly, with zero parsing/deserialization required. The current back-mdb backend still requires a small amount of processing to build the in-memory scaffolding around the on-disk structures, but it's quite cheap - just a quick walk thru an array of integer types and lengths. (And ultimately it made more sense than storing a lot of 64bit pointers all over the disk.) back-mdb can retrieve an entry from LMDB and "parse" it and return it to a client 50% faster than our old BerkeleyDB-based backend can return a fully-cached entry from RAM. And it only requires walking the type/length array, the actual entry data doesn't need to be touched until it's actually written over a socket to the client.