Artificial documents in RavenDB 4.0

Jun 01 2017

Artificial documents are a really interesting feature. They allow you to define an index, and specify that the result of the index will be… documents as well.

Let us consider the following index, running on the Northwind dataset.

We can ask RavenDB to output the result of this index to a collection, in addition to the normal indexing. This is done in the following manner:

And you can see the result here:

The question here is, what is the point? Don’t we already have the exact same data indexed and available as the result of the map/reduce index? Why store it twice?

The answer is quite simple: with the output of the index going into documents, we can now define additional indexes on top of them, which gives us the option to very easily create recursive map/reduce operations. So you can do daily/monthly/yearly summaries very cheaply. We can also apply all the usual operations on documents (subscriptions and ETL processes come to mind immediately). That gives you a lot of power, without incurring a high complexity overhead.
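
To make the recursive map/reduce idea concrete, here is a minimal sketch in plain C# LINQ. It does not use RavenDB at all, and the `Invoice`, `DailyTotal`, `MonthlyTotal` and `RecursiveRollup` names are hypothetical, made up for illustration only. The first step plays the role of the index whose output is written to a collection, and the second step plays the role of an additional index defined on top of those documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; they are not part of the RavenDB API.
public record Invoice(DateTime IssuedAt, decimal Amount);
public record DailyTotal(DateTime Date, decimal Amount);
public record MonthlyTotal(int Year, int Month, decimal Amount);

public static class RecursiveRollup
{
    // First map/reduce: raw invoices -> one summary per calendar day.
    // In RavenDB, these results would be the artificial documents.
    public static List<DailyTotal> ReduceDaily(IEnumerable<Invoice> invoices) =>
        invoices
            .GroupBy(i => i.IssuedAt.Date)
            .Select(g => new DailyTotal(g.Key, g.Sum(i => i.Amount)))
            .OrderBy(d => d.Date)
            .ToList();

    // Second map/reduce: reads only the daily summaries, never the raw invoices.
    public static List<MonthlyTotal> ReduceMonthly(IEnumerable<DailyTotal> days) =>
        days
            .GroupBy(d => (d.Date.Year, d.Date.Month))
            .Select(g => new MonthlyTotal(g.Key.Year, g.Key.Month, g.Sum(d => d.Amount)))
            .OrderBy(m => m.Year).ThenBy(m => m.Month)
            .ToList();
}
```

The monthly step only ever sees one entry per day, however many invoices were issued on that day, which is why stacking summaries this way is cheap. A yearly summary would follow the same pattern on top of the monthly output.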

Tags:

raven
design

Comments

01 Jun 2017
15:31 PM

Dejan Miličić

What does the definition of such an index look like in C#?

01 Jun 2017
15:56 PM

Oren Eini

Dejan, you define it normally, and there is a property called OutputReduceToCollection.

Full sample:

 public class DailyInvoicesIndex : AbstractIndexCreationTask<Invoice, DailyInvoice>
 {
     public DailyInvoicesIndex()
     {
         Map = invoices =>
             from invoice in invoices
             select new DailyInvoice
             {
                 Date = invoice.IssuedAt.Date,
                 Amount = invoice.Amount
             };
         Reduce = results =>
             from r in results
             group r by r.Date
             into g
             select new DailyInvoice
             {
                 Date = g.Key,
                 Amount = g.Sum(x => x.Amount)
             };
         OutputReduceToCollection = "DailyInvoices";
     }
 }

01 Jun 2017
16:10 PM

Judah Himango

Whoa - love this feature. It makes the results a little more first-class, and as you say, then we can do stuff like additional indexes on top of that output. Very cool!

01 Jun 2017
16:19 PM

Judah Himango

Couple more thoughts:

1) I'm curious how the IDs of the artificial documents are generated. The index will cause Orders/1 to output one or more artificial documents into MonthlySalesProduct. If I change Orders/1, the index will... update any existing MonthlySalesProduct(s)? Or wipe out the old ones and generate the new ones? I'm assuming it will update the existing ones, but that makes me curious how the index knows which Orders correlate to which MonthlySalesProduct.

2) If I manually change a MonthlySalesProduct, I assume my changes will be overwritten when the index next runs, correct?

01 Jun 2017
16:44 PM

Oren Eini

1) The ids on the documents are generated as a hash of the reduce keys. If the value changed, we'll overwrite it completely. So the key is actually a hash of, in this case, the month and the product id.

2) Yes, correct.

02 Jun 2017
08:00 AM

Pop Catalin

"The ids on the documents are generated as a hash of the reduce keys"

Would it be possible to use the reduce keys to generate the document ID without hashing, e.g. "MonthlyProductSales/76-12", "DailyInvoices/2017-06-02"?

02 Jun 2017
08:01 AM

Oren Eini

Pop Catalin, No, because we need those to be consistent with the reduce keys.

Oren Eini

CEO of RavenDB