Rob’s Sprint: Indexes and the death of temporary indexes
RavenDB’s ability to analyze your queries and generate the required indexes on the fly has always been a great boon. Rob Ashton was involved in the original implementation and during his visits to Hibernating Rhinos’ secret lair, he got to whack that thing on the head a few more times.
We need to separate two important things:
- Automatically generating the indexes based on your queries.
- The temporary indexes model itself.
The first part is a really important feature. The second is just an implementation detail. In particular, temporary indexes had a few problems.
Most importantly, they were temporary, and promoting an index from one stage to the other was an explicit step. That caused some confusion, and at the exact moment we decided that an index was important enough to keep, the promotion caused the index to effectively reset itself. The other problem was that once an index was upgraded to an auto index, it was there forever.
What Rob did was remove the concept of temporary indexes altogether, which got rid of a whole bunch of code. Instead, we just have standard auto indexes, and a drastically simplified story: there is no longer a drastic jump from temp to auto, with irrecoverable implications.
Of course, this leads to a lot of interesting questions. Temporary indexes had the benefit of being indexed directly to memory, they would go away after a database restart, and they came with a whole lot of other special handling. Not having special code for all of that actually made things a lot simpler for us.
Automatic indexes have an age, and that is tracked internally by RavenDB. If an automatic index isn’t being used, it will become idle and eventually abandoned. If it is a very young index, we will decide it was a temporary index after all, and remove it from the system completely.
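To make the aging rules above a bit more concrete, here is a minimal sketch in Python. It is not RavenDB's actual code: the state names, the `AutoIndex` class, the `next_state` function, the index names, and every threshold are assumptions made up for illustration, and the real rules and timings may well differ.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum


class IndexState(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    ABANDONED = "abandoned"
    REMOVED = "removed"


# Hypothetical thresholds, chosen only for illustration.
IDLE_AFTER = timedelta(hours=1)       # unused this long: no longer active
ABANDON_AFTER = timedelta(days=3)     # unused this long: abandoned
YOUNG_INDEX_AGE = timedelta(hours=2)  # younger than this counts as "very young"


@dataclass
class AutoIndex:
    name: str
    age: timedelta               # time since the index was created
    since_last_query: timedelta  # time since a query last used it


def next_state(index: AutoIndex) -> IndexState:
    """Decide what should happen to an automatic index based on its usage."""
    if index.since_last_query < IDLE_AFTER:
        # Still being queried, so keep it fully up to date.
        return IndexState.ACTIVE
    if index.age < YOUNG_INDEX_AGE:
        # Unused and very young: treat it as a temporary index after all.
        return IndexState.REMOVED
    if index.since_last_query >= ABANDON_AFTER:
        return IndexState.ABANDONED
    return IndexState.IDLE


# Hypothetical index names, for illustration only.
print(next_state(AutoIndex("Auto/Users/ByName", timedelta(days=30), timedelta(hours=5))))
print(next_state(AutoIndex("Auto/Orders/ByDate", timedelta(minutes=90), timedelta(minutes=80))))
```

The point of the sketch is that a single kind of index with tracked age and usage covers what previously required a separate temporary index type and an explicit promotion step.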
This feature, along with idling indexes, opened up the door for the next important feature, index merging. But before that, we need to upgrade the smarts for the query optimizer… which happens to be our next topic.
More posts in "Rob’s Sprint" series:
- (08 Mar 2013) The cost of getting data from LevelDB
- (07 Mar 2013) Result Transformers
- (06 Mar 2013) Query optimizer jumped a grade
- (05 Mar 2013) Faster index creation
- (04 Mar 2013) Indexes and the death of temporary indexes
- (28 Feb 2013) Idly indexing
Comments
It's always nice when you can drop a half of your code just to make something better with less code :)
Please don't remove explicitly created indexes. I have a site that is very rarely used. It would mean the site stops working when RavenDB decides to "clean up".
Daniel, We never remove explicit indexes, only automatic ones.
what happens when a very expensive dynamic query is fired? an automatic index will be created and after some (read: a lot) time, it will not be stale. given if the user doesn't run this query for some time (enough for raven to clean it up), and then runs this request again, wouldn't raven have to pay the cost of indexing everything again for this dynamic query?
Afif, There is no such thing as an expensive query in RavenDB. But yes, if you make a request once a day on something that isn't covered by any index, the behavior will be as you describe. Auto indexes are meant to help the common case, they can't help everywhere. In that scenario, it is the user's responsibility to make sure that they create a permanent index.