Raven and client linq indexes, or: We hate strings, too
When I designed Raven, one of the things that was very clear to me was that I wanted to be able to take advantage of existing features in the framework to the greatest degree possible. This means that I don’t have to reinvent the wheel, and it means that Raven’s users will be able to understand what is going on and how to utilize Raven much more easily.
One of those decisions was to use Linq as the format for defining indexes. All other document databases use Javascript as their index definition format (yes, I know you can use Ruby for CouchDB, that isn’t the default / common approach). But .NET already has this nice syntax, and it already gives me so much information OOTB, and users already know how to make linq do all sorts of crazy stuff. That reduces support questions, not to mention that it is a sexy little feature.
We are running those linq queries on the server (doing dynamic compilation and a bunch of other stuff). The problem is how to get them to the server. We have a really nice UI to do so, of course:
But when it comes right down to it, it is a couple of text boxes, and users may want to be able to define indexes via code. That is perfectly understandable, but it means that users have to do something like this:
    documentStore.DatabaseCommands.PutIndex(
        "UsersByRegion",
        new IndexDefinition()
        {
            Map = @"from user in docs.Users select new {user.Region}",
        });
If you cringe when you see that, welcome to the club. This is still more or less okay, but indexes can get complicated, like this guy:
    store.DatabaseCommands.PutIndex("GameEventCountZoneBySpecificCharacter",
        new IndexDefinition
        {
            Map = @"from doc in docs
                    where doc.DataUploadId != null
                        && doc.RealmName != null
                        && doc.Region != null
                        && doc.CharacterName != null
                        && doc.Zone != null
                        && doc.SubZone != null
                    select new
                    {
                        DataUploadId = doc.DataUploadId,
                        RealmName = doc.RealmName,
                        Region = doc.Region,
                        CharacterName = doc.CharacterName,
                        Zone = doc.Zone,
                        Count = 1
                    };",
            Reduce = @"from result in results
                       group result by new
                       {
                           DataUploadId = result.DataUploadId,
                           RealmName = result.RealmName,
                           Region = result.Region,
                           CharacterName = result.CharacterName,
                           Zone = result.Zone
                       } into g
                       select new
                       {
                           DataUploadId = g.Key.DataUploadId,
                           RealmName = g.Key.RealmName,
                           Region = g.Key.Region,
                           CharacterName = g.Key.CharacterName,
                           Zone = g.Key.Zone,
                           Count = g.Sum(x => x.Count)
                       };"
        });
I could tolerate the first example; I couldn’t tolerate this one. I spent a lot of time exploring how I could get things set up so you’ll be able to use linq (including all the usual strong typing goodies) on the client to define server indexes. The short answer is that a complete solution would take about a month of development. A workable solution would be to serialize the expression tree over the wire. But that would produce the following index definition on the server:
That is not an acceptable option for me.
But if we limit the scope to “just get it working and accept that it won’t handle 100% of the cases”, the problem becomes much easier, and we can now do this:
    documentStore.DatabaseCommands.PutIndex("UsersByLocation",
        new IndexDefinition<LinqIndexesFromClient.User>
        {
            Map = users => from user in users
                           select new { user.Region }
        });
And that would show up on the server as:
It isn’t the original expression, but it is clear enough, I think.
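One way to see why the server-side text cannot be the original expression: by the time a typed lambda reaches the client API as an expression tree, the compiler has already lowered the query syntax into method calls. The sketch below only illustrates that standard .NET behavior and is not Raven's actual conversion code. The `User` class and the `Capture` helper are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical document class, for illustration only.
public class User
{
    public string Name { get; set; }
    public string Region { get; set; }
}

public static class Program
{
    // Hypothetical helper: lets the compiler infer the anonymous result type
    // while still capturing the lambda as an expression tree.
    static Expression<Func<IEnumerable<TDoc>, IEnumerable<TResult>>> Capture<TDoc, TResult>(
        Expression<Func<IEnumerable<TDoc>, IEnumerable<TResult>>> map)
    {
        return map;
    }

    public static void Main()
    {
        var map = Capture((IEnumerable<User> users) =>
            from user in users select new { user.Region });

        // The query syntax has already been lowered to a method call.
        var call = (MethodCallExpression)map.Body;
        Console.WriteLine(call.Method.DeclaringType.Name + "." + call.Method.Name); // Enumerable.Select

        // The projection is a nested lambda whose body constructs the anonymous type.
        var projection = (LambdaExpression)call.Arguments[1];
        Console.WriteLine(projection.Body.NodeType); // New
    }
}
```

Anything that turns such a tree back into index text has to rebuild readable code from nodes like these, which is why the result reads a little differently from what was typed on the client.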
What about that monster query? We can now write it like this:
    documentStore.DatabaseCommands.PutIndex("GameEventCountZoneBySpecificCharacter",
        new IndexDefinition<Game.GameEvent, Game.GameEventCount>
        {
            Map = docs => from doc in docs
                          where doc.DataUploadId != null
                              && doc.RealmName != null
                              && doc.Region != null
                              && doc.CharacterName != null
                              && doc.Zone != null
                              && doc.SubZone != null
                          select new
                          {
                              doc.DataUploadId,
                              doc.RealmName,
                              doc.Region,
                              doc.CharacterName,
                              doc.Zone,
                              Count = 1
                          },
            Reduce = results => from result in results
                                group result by new
                                {
                                    result.DataUploadId,
                                    result.RealmName,
                                    result.Region,
                                    result.CharacterName,
                                    result.Zone
                                } into g
                                select new
                                {
                                    g.Key.DataUploadId,
                                    g.Key.RealmName,
                                    g.Key.Region,
                                    g.Key.CharacterName,
                                    g.Key.Zone,
                                    Count = g.Sum(x => x.Count)
                                }
        });
Well, this will turn into this:
It isn’t as nice as the original query, I’ll admit, but it is still highly readable.
And yes, it takes scary code to get there :-)
Comments
I don't know why, but I actually use the extension methods more than direct linq syntax. It just feels more natural to me. So your approach is absolutely beautiful.
I'm with @Ngoc, I actually prefer extension method syntax to pure LINQ. The fact that we can use either is just a testament to how well-designed, powerful and expressive LINQ is. Big kudos to Erik Meijer for this killer feature.
Really? A whole 'EiniMonth'? Hmmm, I can't shake the feeling there is a relativistic time dilation associated with any such measure.
J Healy,
Yes, a month. Linq is freaking complicated.
The second version of that monster query...are Game.GameEvent, Game.GameEventCount the types that are used to make the LINQ-Queries strong-typed? So, they could then become out of sync with the actual documents (even though the same goes for the string-style indexes, I suppose)? Are they merely placeholders, to be defined by a Raven user, or...
In that last query, GameEvent is the document itself, and yes - GameEventCount is a separate class to help with the Linq, but it only has one property, as it inherits from GameEvent to get all the other ones.
Thus no out of sync problems to worry about.
Is there any plan to have IQueryable support for client-side queries? Or to have a Count method in IDocumentQuery<T>?
Guillaume,
IDocumentQuery has a TotalResults property that you can access.
As for IQueryable, probably, but it is a low priority item at the moment.
Why not just pretty-print the expression tree when you show it to the user? That seems like a more direct and robust solution..
Max,
Show me the code to do so, and the resulting output
Looking at the RavenDB code, where should I start if I wanted to extend it on the server side, e.g. drop in mymapreduce.dll? I don't think I'll ever be able to serialize my map reduce functions.
Matt,
Look at the CompiledIndex test