RavenDB 4.0: Raven Query Language
The last big feature for RavenDB 4.0 has landed, and it is a big one. You can see the details on the PR that implemented this feature below, but you probably care a lot more about what it is.
RavenDB uses Lucene as the underlying index technology, and until now we simply exposed (slightly modified) Lucene syntax to our clients. That was easy to do and quite effective, but it also meant that we were limited to a somewhat arcane query language and what it could do.
In RavenDB 4.0, we now have a full query language, and you can see what it looks like below:
This will produce the results that you expect, giving you all the companies residing in London in the database.
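The query itself does not survive in this copy of the post. As a rough sketch, assuming a Companies collection whose documents carry an Address.City field (both names are assumptions modeled on RavenDB's Northwind sample data, not taken from the post), such a query might read:

```rql
// Assumed collection and field names; adjust to your own documents.
from Companies
where Address.City = 'London'
```

No index is named here, so the optimizer is left to pick or create one.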
The rest of the system behaves just the same: this query is going to hit the query optimizer, an index will be created if one does not already exist, etc. It is just that our query language is both much nicer to look at and allows us to work with it in a much more structured manner (and yes, that is a pun).
We also support aggregation:
Which gives:
This automatically creates a map/reduce index and does all the work for you. We also have support for querying on indexes directly via:
If you are familiar with how we used to have to do range queries, you can see how big an improvement this is. This is actually a pretty significant feature, because you can define a static index to do whatever you want with the data, and then query on top of that.
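Neither the aggregation query nor the index query is preserved in this copy. The sketch below shows roughly what the two could look like, assuming an Orders collection with a Company field and a hypothetical static index named 'Orders/Totals' that exposes a numeric Total field. All of these names are illustrative. The clauses follow the from-first order of RQL as later documented, which may differ from the select-first form discussed in the comments.

```rql
// Aggregation over an assumed Orders collection: count orders per company.
// No index is named, so the optimizer is left to create the map/reduce index.
from Orders
group by Company
select key() as Company, count() as Count

// Separate query: a range filter against a hypothetical static index.
// The index name and the Total field are assumptions.
from index 'Orders/Totals'
where Total between 100 and 500
```

The two statements are meant to be run one at a time; they are shown together only for comparison.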
You can also do the usual full text operations directly in the query language:
We decided to go with the method abstraction for most operations, because it allows us a lot of freedom in the syntax and gives very readable queries.
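As an illustration of that method style, here is a minimal full text query, assuming a Companies collection with a Name field and an arbitrary search term (none of which come from the original post):

```rql
// search() is written as a method call inside the where clause.
// Collection, field, and search term are illustrative assumptions.
from Companies
where search(Name, 'market')
```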
Here is an example of us trying a more complex query. In this one, I want to find companies in London, the UK or France. But instead of just wanting to find them in that particular order, I want to get them with ranking.
I really want a company in London, so that should sort first, then UK-based companies, and finally French companies. You can see the results of the query below. This query also shows how we can do projections in a much nicer way.
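The ranked query is not reproduced in this copy either. A sketch of the idea follows, assuming Address.City and Address.Country fields on Companies documents. The weight of 3 for London echoes a reader comment below; the other weights and the projected field names are illustrative.

```rql
// Each boost() wraps one condition and raises the score of documents matching it.
// A larger weight means a stronger preference, so London outranks UK, which outranks France.
from Companies
where boost(Address.City = 'London', 3)
   or boost(Address.Country = 'UK', 2)
   or boost(Address.Country = 'France', 1)
order by score()
select Name, Address.City as City, Address.Country as Country
```

The select clause is the projection: only the listed fields come back, with the nested address fields flattened under the aliases City and Country.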
The feature just landed in the main branch and we are now working through all of the rough edges, but it is very exciting, since it gives you a natural way to query RavenDB without losing any of the power.
I mentioned that this was a big change, right?
And that is just for the C# work; we still have to update the other clients.
More posts in "RavenDB 4.0" series:
- (30 Oct 2017) automatic conflict resolution
- (05 Oct 2017) The design of the security error flow
- (03 Oct 2017) The indexing threads
- (02 Oct 2017) Indexing related data
- (29 Sep 2017) Map/reduce
- (22 Sep 2017) Field compression
Comments
Why SELECT first? You should go with LINQ-like syntax, not SQL-like. I remember reading there are quite a lot of reasons why SELECT should go last. Code-completion being one of them.
Yep, the decision of the LINQ team to put the select last, was a good one.
@Oren .. Looks awesome and very powerful! Can't believe how quickly you have knocked this out!!! @Euphoric probably because it's familiar and will encourage more developers from a SQL background to try and use the product.
@Ian Honestly, I am always annoyed by SQL-like syntax. I learned coding before SQL, and SQL will always be backwards for me. LINQ order is much more logical than SQL's madness.
Congrats on the new feature. Looks to be a big change that will provide a lot of value. One comment regarding the ranking however, after reading the section and then viewing the query I was surprised to see 3 as indicating a higher rank. With the explanation it makes sense, higher number equals more weight. Intuitively, however, to me the ranking would start at one and go up instead.
Hmm, more lines deleted than added ... How is that possible, for such a big feature?
Maybe someone used global format on the solution and his local settings were opening brace { on the end of statement instead of new line ;)
@Ryan we had to delete the old code that supported lucene syntax so it makes sense.
@Tal, just so I understand what you said there - does this mean that we can no longer query using Lucene syntax (i.e. Advanced.DocumentQuery with Lucene where clause) because that's a substantial breaking change as I understand it? The shift to 4.0 looks pretty big as it is without re-writing a ton of Lucene queries as well...
Euphoric, We put select first because that is how SQL does it, and we want to make it as easy as possible for people to start using this. We are also allowing the select to be placed at the end of the query, so you can get nicer intellisense and can see the query as a series of steps in the order that they are defined, but select first seems very natural for a query language exactly because it is the way SQL does that.
We currently allow both, but we'll decide before release whether we want it always at the end.
Ian, It took less than a week to do the parser, then about 3 weeks to refactor the code to use it properly. The actual hard part for us right now is that it enables so many interesting features that it is hard to stop.
Jedak, That is how Lucene does this. You can use negative numbers and get closer to zero, I guess, but the idea is that a higher score means a better match.
Ian, Anything that you previously could do is still retained. Using the raw syntax, you previously had:
"Username: foo"
Now you can translate that as is to:
Or something similar. For the most part, the client API didn't change at all because of this, although we did add a nicer way to pass the direct Lucene syntax through there.
Thanks Oren, we have quite a lot of places where we have not used the query builders and write Lucene queries directly by passing in a string where clause. I am not quite sure from your response whether this will still be allowed, or if we need to go through and convert these places to use a different approach?