Artificial documents in RavenDB 4.0
Artificial documents are a really interesting feature. They allow you to define an index, and specify that the result of the index will be… documents as well.
Let us consider the following index, running on the Northwind dataset.
We can ask RavenDB to output the result of this index to a collection, in addition to the normal indexing. This is done in the following manner:
And you can see the result here:
The question here is, what is the point? Don’t we already have the exact same data indexed and available as the result of the map/reduce index? Why store it twice?
The answer is quite simple: with the output of the index going into documents, we can now define additional indexes on top of them, which gives us the option to very easily create recursive map/reduce operations. So you can do daily/monthly/yearly summaries very cheaply. We can also apply all the usual operations on documents (subscriptions and ETL processes come to mind immediately). That gives you a lot of power without incurring a high complexity overhead.
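To make the recursive map/reduce idea concrete, here is a minimal sketch in plain Python rather than RavenDB code. It assumes a handful of made-up order lines and a hypothetical `reduce_by` helper standing in for a map/reduce index; the field names `Month`, `Product` and `Count` are illustrative only. The first pass produces monthly totals, those totals are materialized as documents, and a second pass reduces those documents into yearly totals.

```python
from collections import defaultdict

# Hypothetical order lines: (order date as "YYYY-MM-DD", product id, quantity).
order_lines = [
    ("2017-05-03", "products/76", 2),
    ("2017-05-19", "products/76", 3),
    ("2017-06-02", "products/76", 1),
    ("2017-06-02", "products/12", 4),
]


def reduce_by(items, key_of, value_of):
    """Group items by key and sum their values (a stand-in for a map/reduce index)."""
    totals = defaultdict(int)
    for item in items:
        totals[key_of(item)] += value_of(item)
    return dict(totals)


# First level: order lines -> totals per (month, product).
monthly = reduce_by(
    order_lines,
    key_of=lambda line: (line[0][:7], line[1]),
    value_of=lambda line: line[2],
)

# The monthly results become documents, so another index can consume them.
monthly_docs = [
    {"Month": month, "Product": product, "Count": count}
    for (month, product), count in monthly.items()
]

# Second level: monthly documents -> totals per (year, product).
yearly = reduce_by(
    monthly_docs,
    key_of=lambda doc: (doc["Month"][:4], doc["Product"]),
    value_of=lambda doc: doc["Count"],
)

print(yearly)
```

The second pass only ever sees the small set of monthly documents, not the raw orders, which is why the higher-level summaries are cheap.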
Comments
What does the definition of such an index look like in C#?
Dejan, you define it normally, and there is a property called OutputReduceToCollection. Full sample:
Whoa - love this feature. It makes the results a little more first-class, and as you say, then we can do stuff like additional indexes on top of that output. Very cool!
Couple more thoughts:
I'm curious how the IDs of the artificial documents are generated. The index will cause Orders/1 to output one or more artificial documents into MonthlySalesProduct. If I change Orders/1, the index will...update any existing MonthlySalesProduct(s)? Or wipe out the old ones and generate the new ones? I'm assuming it will update the existing, but that makes me curious how the index knows which Orders correlate to which MonthlySalesProduct.
If I manually change a MonthlySalesProduct, I assume my changes will be overwritten when the index next runs, correct?
1) The ids on the documents are generated as a hash of the reduce keys. If the value changed, we'll overwrite it completely. So the key is actually a hash of, in this case, the month and the product id.
2) Yes, correct.
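The id scheme described in 1) can be illustrated with a small Python sketch. This is not RavenDB's actual implementation: the hash function (SHA-256), the id format, the in-memory `store` dictionary, and the helper names `artificial_id` and `output_result` are all assumptions made for illustration. The point is only that an id derived from the reduce key is deterministic, so a recomputed result lands on the same document and replaces it.

```python
import hashlib


def artificial_id(collection, reduce_key):
    """Derive a deterministic document id from the reduce key (hypothetical format)."""
    raw = "|".join(str(part) for part in reduce_key)
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
    return f"{collection}/{digest}"


# In-memory stand-in for the output collection.
store = {}


def output_result(collection, reduce_key, value):
    """Write a reduce result; the same reduce key always hits the same document."""
    store[artificial_id(collection, reduce_key)] = value


june_chai = ("2017-06", "products/76")
output_result("MonthlyProductSales", june_chai, {"Count": 1})

# The underlying order changes and the reduce result is recomputed:
# the same key hashes to the same id, so the old document is overwritten.
output_result("MonthlyProductSales", june_chai, {"Count": 5})

# A different reduce key produces a different document.
output_result("MonthlyProductSales", ("2017-06", "products/12"), {"Count": 4})

print(len(store))
```

Any manual edit to one of these documents would be lost in the same way, since the next write for that reduce key replaces the whole value.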
"The ids on the documents are generated as a hash of the reduce keys"
Would it be possible to use the reduce keys to generate the document id without hashing, i.e. "MonthlyProductSales/76-12", "DailyInvoices/2017-06-02"?
Pop Catalin, No, because we need those to be consistent with the reduce keys.