Making assumptions, design decisions and critical functionality
In the world of design (be it software or otherwise), being able to make assumptions is a good thing. If I can’t assume something, I have to handle it. For example, if I can assume a competent administrator, I don’t need to write code to handle a disk full error. A competent admin will never let that scenario happen, right?
In some cases, such assumptions are critical to being able to design a system at all. In physics, you’ll often run into questions involving spherical objects in a vacuum, for example. That allows us to drastically simplify the problem. But you know what they say about assuming, right? I’m not a physicist, but I think it is safe to say that most applied physics doesn’t involve spherical objects in a vacuum. I am a developer, and I can tell you that if you skip handling a disk-full error because you assume a competent admin, you won’t pass a code review for production code anywhere.
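To make that concrete, here is a minimal sketch in C (my own illustration, not taken from any particular codebase) of what “handling a disk full error” means at the call site: you check the result of the write and surface ENOSPC instead of assuming it cannot happen.

```c
#include <errno.h>
#include <unistd.h>

/* Illustrative helper: write a buffer fully and report failures,
 * including the disk-full case, instead of assuming a competent
 * admin guarantees it never happens. */
static int write_fully(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;           /* ENOSPC (disk full) or another I/O error */
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```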
And that leads me to the trigger for this post. Howard Chu, for whom I have quite a bit of respect, made the following statements:
People still don't understand that dynamically growing the DB is stupid. You store the DB on a filesystem partition somewhere. You know how much free space you want to allow for the DB. Set the DB maxsize to that. Done. No further I/O overhead for growth required.
Whether you grow dynamically or preallocate, there is a maximum size of free space on your storage system that you can't exceed. Set the DB maxsize in advance, avoid all the overhead of dynamically allocating space. Remember this is all about *efficiency*, no wasted work.
I have learned quite a lot from Howard, and I very strongly disagree with the above line of thinking.
Proof by contradiction: RavenDB is capable of dynamically extending the disk size of the machine on the fly. You can watch it here; it’s part of a longer video, but a single minute is enough to see how I can extend the disk size while the system is running and immediately make use of the new space. With RavenDB Cloud, we monitor the disk size on the fly and extend it automatically. That means you can start with a small disk and have it grow as your data size increases, without having to figure out up front how much disk space you’ll need. And the best part: you have exactly zero downtime while this is going on.
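I won’t pretend this is the actual RavenDB Cloud implementation; the following is only a sketch of the general idea, where the threshold, the check interval, and request_volume_resize() are hypothetical placeholders for whatever the platform exposes: a background loop watches free space and asks for a bigger volume before the database ever sees a full disk.

```c
#include <stdbool.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Hypothetical hook: ask the cloud platform / volume manager to grow
 * the underlying disk. Not a real API. */
extern bool request_volume_resize(const char *path, unsigned long long extra_bytes);

static void monitor_disk(const char *data_path)
{
    for (;;) {
        struct statvfs vfs;
        if (statvfs(data_path, &vfs) == 0) {
            unsigned long long free_bytes  = (unsigned long long)vfs.f_bavail * vfs.f_frsize;
            unsigned long long total_bytes = (unsigned long long)vfs.f_blocks * vfs.f_frsize;

            /* Illustrative policy: grow by 25% when less than 10% is free. */
            if (total_bytes > 0 && free_bytes * 10 < total_bytes)
                request_volume_resize(data_path, total_bytes / 4);
        }
        sleep(60);  /* arbitrary check interval */
    }
}
```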
Howard is correct that being able to set the DB max size at the time you open it will simplify things significantly. There is a non-trivial amount of dancing about that RavenDB has to do in order to achieve this functionality. I consider the ability to dynamically extend the size a mandatory feature for RavenDB, because it simplifies the life of the operators and makes it easier to use RavenDB. You don’t have to ask the user a question that they don’t have enough information to answer very early in the process. RavenDB will Just Work and use as much of your hardware as you have available. And, as you can see in the video, it can take advantage of flexible hardware arrangements on the fly.
I have two other issues that I disagree with Howard on:
“You know how much free space you want to allow for the DB” – that is the key assumption I disagree with. You typically don’t know that. If you are deploying an LDAP server, which is one of Howard’s key scenarios, you’ll likely have a good idea about sizing upfront. However, for most scenarios, there is really no way to tell upfront. There is also another aspect. Having to allocate a chunk of disk space upfront is a hostile act toward the user. Leaving aside the fact that you are asking a question they cannot answer (which they will resent you for), having to allocate 10GB to store a little bit of data (because the user will not try to compute an optimal value) is going to give a bad impression of the database. “Oh, you need so much space to store so little data.”
In terms of efficiency, that means I can safely start very small and grow as needed, so I’m never surprising the user with unexpected disk utilization or forcing them to hit arbitrary limits. For tests, ad-hoc operations, or just normal, unpredictable workloads, that gives you a lot of advantages.
“…avoid the overhead of dynamically allocating space” – There is complexity involved in being able to dynamically grow the space, yes, but there isn’t really much (or any) overhead. Where Howard’s code will return an ENOSPC error, mine will allocate the new disk space, map it, and move on. Only when you run out of the allocated space will you run into issues, and that turns out to be rare enough. Because it is an expensive operation, we don’t do it often. We double the size of the space allocated (starting from 256KB by default) on each hit, all the way to the 1GB mark, after which we allocate a GB range each time. What this means is that, in terms of the actual requests we make of the file system, we do big allocations, allowing the file system to optimize the way the data is laid out on the physical disk.
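The policy described above boils down to something like the following sketch; the function name and the exact constants are my own, and whether “double” applies to the increment or to the whole file is a detail I’m glossing over, but the shape matches the description: double from 256KB up to the 1GB mark, then hand the file system 1GB ranges.

```c
#include <stdint.h>

#define INITIAL_GROWTH  (256 * 1024ULL)           /* 256KB default starting step */
#define ONE_GB          (1024ULL * 1024 * 1024)

/* Compute the next allocation step: double the previous step until the
 * 1GB mark, then keep allocating 1GB ranges. Few, large allocations let
 * the file system lay the data out contiguously on disk. */
static uint64_t next_growth(uint64_t previous_growth)
{
    if (previous_growth == 0)
        return INITIAL_GROWTH;
    if (previous_growth >= ONE_GB)
        return ONE_GB;

    uint64_t next = previous_growth * 2;
    return next > ONE_GB ? ONE_GB : next;
}
```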
I think that the expected use cases and deployment models are very different for my databases and Howard’s, and that leads to a very different worldview about what assumptions are acceptable to make.
Comments
I don't think it's so important - basically, if you grow by a factor of 2 every time, you'll quickly end up with a size that's good enough and will remain good enough for a long time (if your database grows linearly in time). You will start with quite frequent resizes, but after a year your next resize will probably be a year away. So there might be some savings for users who have no idea how much data they're going to have, but the more experienced ones can make a guess and allocate upfront without having to worry about it for a year or two.
Rafal,
We grow by a factor of 2 only until we get to 1GB, then we grow by 1GB at a time. If you have linear growth, you'll see resizes often.
If Howard means by "preallocate" a non-sparse allocation, that simplification may come at significant expense, especially for modern pay-as-you-go platforms. I suppose for a huge corporation with deep pockets the simplification is the overriding factor.
Peter,
Yes, for cloud systems, pre-allocation of the data can be a huge cost burden.
And the issue isn't simplification for the user, it is for the dev.
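To make the sparse vs. non-sparse distinction concrete, here is a small sketch (mine, not code from LMDB or RavenDB): ftruncate() only sets the logical size and leaves the file sparse, while posix_fallocate() actually backs the whole range with blocks, which is the variant that costs you real space up front.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_DB_SIZE (10LL * 1024 * 1024 * 1024)   /* illustrative 10GB max size */

int main(void)
{
    int fd = open("example.db", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* Sparse: the file reports 10GB, but blocks are only consumed as
     * pages are actually written. */
    if (ftruncate(fd, (off_t)MAX_DB_SIZE) != 0)
        perror("ftruncate");

    /* Non-sparse: physically reserve the whole range now, so the full
     * 10GB must exist (and be paid for) up front. */
    if (posix_fallocate(fd, 0, (off_t)MAX_DB_SIZE) != 0)
        fprintf(stderr, "posix_fallocate failed\n");

    close(fd);
    return 0;
}
```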
LMDB was really designed for more-or-less fixed sizes.
https://symas.com/understanding-lmdb-database-file-sizes-and-memory-utilization/
Snips: "...in a system where the net number of entries in a database remains the same...", "...we see growth to a point where an operating equilibrium is attained...", "...while it is prudent to monitor the memory and disk footprint of an LMDB database..."
Peter,
Yes, very different model from what you'll use RavenDB for.