Graphs in RavenDB: The query language

Sep 19 2018

Pretty much all our early discussions about graphs in RavenDB focused on how to build the actual graph implementation: how to allow fast traversal, etc. When we started looking at the actual implementation, we realized that we had seriously neglected a very important piece of the puzzle, the query interface for the graphs.

This is important for several reasons. First, ergonomics matter: if we end up with a query language that is awkward, it won’t see much use and will complicate the users’ lives (and our own). Second, the query language effectively dictates how the user thinks about the model, so making low-level decisions that would affect how the user actually uses this feature is probably not a good idea yet. We need to start from the top, with what we give to the user, and then see how we can make that a reality.

The most common use case of graph queries is the friends of friends query. Let’s see how this query is handled in various existing implementations, shall we?

Neo4J, using Cypher:

OrientDB doesn’t seem to have an easy way to do this. The following shows how you can find the 2nd degree friends, but it doesn’t exclude friends of friends who are already your friends. StackOverflow questions on that show a scary amount of code, so I’m going to skip them.

Gremlin, which is used in a wide variety of databases:

We looked at other options, but it seems that graph query languages fall into the following broad categories:

ASCII art to express the relationship between the nodes.
SQL extensions that express the relationships as nested queries.
Method calls to express the traversal.

Of the three options, we found the first one, ASCII art in the style of Cypher, to be the easiest to work with. This is true both in terms of writing the query and actually executing it.

Let’s look at what a friends of friends query will look like in RavenDB:

Graph queries are composed of two portions:

With clauses, which determine the source points for the graph traversal.
A match clause (singular), which contains the graph pattern that we need to match on.

In the case above, we are starting the graph traversal from start, which is defined in a with clause. A query can have multiple with clauses, each defining an alias that can be used in the match clause. The match clause, on the other hand, uses these aliases to decide how to process the query.

You can see that we have two clauses in the above query, and the actual processing is done by pattern matching (to me, it makes sense to compare it to regular expressions or Prolog). It is probably easier to show this with an example. Here is the relationship graph among a few people:

We’ll set the starting point of the graph as Arava and see how this will be processed in the query.

For the first clause, we’ll have:

start (Arava) –> f1 (Oscar) –> f2 (Phoebe)
start (Arava) –> f1 (Oscar) –> f2 (Sunny)
start (Arava) –> f1 (Sunny) –> f2 (Phoebe)
start (Arava) –> f1 (Sunny) –> f2 (Oscar)

For the second clause, on the other hand, we’ll have:

start (Arava) –> f2 (Oscar)
start (Arava) –> f2 (Sunny)

These clauses are joined using the and not operator. What this means is that we need to exclude from the first clause anything that matches the second clause. A match, in this case, means that every alias the two results have in common is bound to the same value.

Here is what we end up with:

start (Arava) –> f1 (Oscar) –> f2 (Phoebe)
~~start (Arava) –> f1 (Oscar) –> f2 (Sunny)~~
start (Arava) –> f1 (Sunny) –> f2 (Phoebe)
~~start (Arava) –> f1 (Sunny) –> f2 (Oscar)~~

We removed two entries because they matched the entries from the second clause. The end result is just the friends of my friends who aren’t already my friends.
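The walkthrough above can be sketched in a few lines of Python. This is only an illustration of the matching semantics, not how RavenDB implements it. The edge list is an assumption reconstructed from the results listed above, and `match_path` and `and_not` are hypothetical helper names. The sketch also assumes that a traversal never goes back to a node it has already visited.

```python
# Assumed FriendsOf edges, reconstructed from the results in the walkthrough.
FRIENDS_OF = {
    "Arava": ["Oscar", "Sunny"],
    "Oscar": ["Arava", "Phoebe", "Sunny"],
    "Sunny": ["Arava", "Phoebe", "Oscar"],
    "Phoebe": [],
}


def match_path(graph, start, aliases):
    """Bind each alias to a node along a FriendsOf path.

    The first alias is bound to the starting node. A path never revisits
    a node that is already bound in the same result (an assumption).
    """
    results = [{aliases[0]: start}]
    for prev, alias in zip(aliases, aliases[1:]):
        extended = []
        for binding in results:
            for friend in graph.get(binding[prev], []):
                if friend not in binding.values():
                    extended.append({**binding, alias: friend})
        results = extended
    return results


def and_not(left, right):
    """Drop every left result that agrees with some right result
    on all of the aliases the two have in common."""
    def agrees(a, b):
        shared = a.keys() & b.keys()
        return all(a[key] == b[key] for key in shared)

    return [a for a in left if not any(agrees(a, b) for b in right)]


# First clause: start -> f1 -> f2. Second clause: start -> f2.
first = match_path(FRIENDS_OF, "Arava", ["start", "f1", "f2"])
second = match_path(FRIENDS_OF, "Arava", ["start", "f2"])
friends_of_friends = and_not(first, second)

for row in friends_of_friends:
    print(row)
```

With the assumed edges, the first clause yields four results, the second yields two, and only the two paths ending in Phoebe survive the and not. Note that f2 has to carry the same name in both clauses, since shared aliases are what the exclusion compares.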

The idea behind the query language is that we want to be high level and allow you to express what you want, and we’ll be in charge of actually making this work properly.

In the next post, I’ll talk a bit more about the query language, what scenarios it enables and how we are going to go about processing queries.

Comments

19 Sep 2018
11:30 AM

Daniel Crabtree

I really like this syntax, it makes the queries really clear.

Just wondering though, shouldn't the first clause in your example also pick up:

start (Arava) –> f1 (Oscar) –> f2 (Arava)
start (Arava) –> f1 (Sunny) –> f2 (Arava)

19 Sep 2018
11:34 AM

Rafal

SQL is like wine - gains taste, elegance and quality just by sitting there and doing nothing

19 Sep 2018
11:38 AM

svick

Do you really need all that syntax?

For example, couldn't you change from:

(start)-[:FriendsOf]->(f1)-[:FriendsOf]->(f2)

to something simpler, like:

start -FriendsOf-> f1 -FriendsOf-> f2

Or, if queries that repeatedly use the same edge kind are common:

FriendsOf: start->f1->f2

Though I understand that basing your language on an existing language has its advantages too.

19 Sep 2018
12:58 PM

Steve

Is the second naming of f2 required or could it just have been f3? (I know it's a copy from Cypher but just wondering if it needs to be the same) Orient and Gremlin have an explicit dedup/distinct, will this be needed in RavenDB as well or is it implicitly distinct?

20 Sep 2018
06:14 AM

Oren Eini

Daniel, The query wouldn't go back to a node that it already visited

20 Sep 2018
06:15 AM

Oren Eini

Svick, I probably could, yes. However, note that this is the simplest query possible, a lot more complexity isn't discussed yet, so you need to account for that as well. And using something that builds on existing stuff means that people in the field can much more easily grok it.

20 Sep 2018
06:16 AM

Oren Eini

Steve, You need f2 to be the same, otherwise, we won't know how to filter the already existing friends.

20 Sep 2018
09:36 AM

Daniel Crabtree

I thought that might be the case.

Assuming a model like Twitter where follows is directional, is it still possible with this query language to write a query like find everyone I follow that follows me back?

20 Sep 2018
09:38 AM

Oren Eini

Daniel, You want something like, find all the users that follow me that I also follow? You can do that using: match (me:Account)-[:Follows]->(other:Account)-[:Follows]->(me)

20 Sep 2018
11:09 AM

Daniel Crabtree

Awesome.

The default restriction that f2 != start makes perfect sense now.

20 Sep 2018
13:36 PM

peter

@Rafal, no doubt tongue-in-cheek, but still SQL is still fermenting it seems, nowadays supports graph queries, e.g.:

SELECT Person3.name AS FriendName
FROM Person Person1, Person Person2, friends, friends friends2, Person Person3
WHERE MATCH(Person1-(friends)->Person2-(friends2)->Person3)
AND Person1.name = 'Marek Masko';

21 Sep 2018
10:10 AM

Rafal

@peter then SQL-92 is the standard :) and it already has the syntax for joins to do graph traversal. I realize it's a language for processing a particular model of data, not applicable to every graph database, but still it's built on top of well defined abstractions and the syntax reflects that in a clear way. And here we're talking about some ad-hoc ASCII art for expressing some connections between data, but without building the 'algebra' out of it. So pretty soon you will have to add new symbols to the language because it's so simple. For example, how do you express that some node is reachable/unreachable from another node in any number of steps? or there is a cycle?

21 Sep 2018
16:42 PM

Oren Eini

Rafal, Actually, you pretty much have answers (and syntax) for all of these in languages like Cypher. Providing a range of steps is something like: (user)-[:FriendOf {*..2}]->(friend)

Graph problems are a very well studied field.

Oren Eini

CEO of RavenDB
