Rob’s Sprint: Indexes and the death of temporary indexes


RavenDB’s ability to analyze your queries and generate the required indexes on the fly has always been a great boon. Rob Ashton was involved in the original implementation and during his visits to Hibernating Rhinos’ secret lair, he got to whack that thing on the head a few more times.

We need to separate two important things:

  • Automatically generating the indexes based on your queries.
  • The temporary indexes model itself.

The first part is a really important feature. The second is just an implementation detail. In particular, temporary indexes had a few problems.

Most importantly, they were temporary, and there was an explicit step for promoting those indexes from one stage to the other. That caused some confusion, and at the exact moment when we decided that an index was important enough to keep, the promotion caused the index to effectively reset itself. The other problem was that the moment an index was upgraded to an auto index, it was there forever.

What Rob did was remove the concept of temporary indexes altogether, which got rid of a whole bunch of code. Instead, we have just standard auto indexes, and a drastically simplified story. We no longer have the drastic jump from temp to auto, with its irrecoverable implications.

Of course, this leads to a lot of interesting questions. Temporary indexes had the benefit of being kept directly in memory and of going away after a database restart, along with a whole lot of other special behavior. Not having special code for all of that made things a lot simpler for us, actually.

Automatic indexes have an age, and that is tracked internally by RavenDB. If an automatic index isn’t being used, it will become idle and eventually abandoned. If it is a very young index, we will decide it was a temporary index after all, and remove it from the system completely.
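
To make that lifecycle concrete, here is a minimal sketch in Python of how such a decision could look. This is not RavenDB's actual code: the thresholds, the `AutoIndex` and `IndexState` types, and the `classify` function are all hypothetical, made up purely to illustrate the idea of deciding an unused index's fate from its age and its last query time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class IndexState(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    ABANDONED = "abandoned"
    REMOVED = "removed"


# Hypothetical thresholds, chosen only for illustration.
# They are NOT the values RavenDB uses internally.
IDLE_AFTER = timedelta(hours=1)        # unused this long: no longer active
ABANDON_AFTER = timedelta(days=3)      # unused this long: abandoned
YOUNG_INDEX_AGE = timedelta(hours=6)   # younger than this counts as "very young"


@dataclass
class AutoIndex:
    name: str
    created_at: datetime
    last_queried_at: datetime


def classify(index: AutoIndex, now: datetime) -> IndexState:
    """Decide what should happen to an auto index at time `now`."""
    unused_for = now - index.last_queried_at
    if unused_for < IDLE_AFTER:
        # Still being queried, so leave it alone.
        return IndexState.ACTIVE

    age = now - index.created_at
    if age < YOUNG_INDEX_AGE:
        # Unused and very young: treat it as having been temporary
        # after all, and drop it entirely.
        return IndexState.REMOVED

    if unused_for >= ABANDON_AFTER:
        return IndexState.ABANDONED
    return IndexState.IDLE


if __name__ == "__main__":
    now = datetime(2013, 3, 4, 12, 0, 0)
    # Index name is illustrative only.
    young = AutoIndex("Auto/Users/ByName", now - timedelta(hours=2),
                      now - timedelta(minutes=90))
    print(classify(young, now).value)
```

The point of the sketch is that there is a single kind of index and a single decision function, rather than two index types with an explicit promotion step between them.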

This feature, along with idling indexes, opened up the door for the next important feature, index merging. But before that, we need to upgrade the smarts for the query optimizer… which happens to be our next topic.

More posts in "Rob’s Sprint" series:

  1. (08 Mar 2013) The cost of getting data from LevelDB
  2. (07 Mar 2013) Result Transformers
  3. (06 Mar 2013) Query optimizer jumped a grade
  4. (05 Mar 2013) Faster index creation
  5. (04 Mar 2013) Indexes and the death of temporary indexes
  6. (28 Feb 2013) Idly indexing