RavenDB performance optimizations

Just to note, you’ll probably read this post about a month after the change was actually committed.

I spent the day working on a very simple task: reducing the number of writes that RavenDB makes when we perform a PUT operation. I managed to remove one write operation from the process, but it took a lot of work.

I thought that I might show you what removing a single write operation means, so I built a simple test harness to give me consistent numbers (in the source, look for Raven.Performance).

Please note that the perf numbers are for vanilla RavenDB, with the default configuration, running in debug mode. We can do better than that, but what I am interested in is not absolute numbers, but the change in those numbers.

Here are the results for build 124, before the change:

Wrote 5,163 documents in 5,134ms: 1.01: docs/ms
Finished indexing in 8,032ms after last document write

And here are the numbers for build 126, after the change:

Wrote 5,163 documents in 2,559ms: 2.02: docs/ms
Finished indexing in 2,697ms after last document write

So we get double the speed at write time, and we also get much better indexing speed. The indexing gain is sort of an accidental byproduct: we now index documents based on a range, rather than on a specific key. But it is a very pleasant accident.
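To make the comparison concrete, here is a small sketch that recomputes the throughput and speedup from the figures reported above. It is written in Python rather than the C# of the Raven.Performance harness, and the function and variable names are hypothetical, chosen only for illustration.

```python
def throughput(docs: int, elapsed_ms: int) -> float:
    """Documents written per millisecond."""
    return docs / elapsed_ms


# Figures copied from the build 124 and build 126 runs quoted above.
DOCS = 5163
before = throughput(DOCS, 5134)   # build 124
after = throughput(DOCS, 2559)    # build 126

write_speedup = after / before    # same as 5134 / 2559
index_speedup = 8032 / 2697       # indexing time after the last write

print(f"write: {before:.2f} -> {after:.2f} docs/ms ({write_speedup:.2f}x)")
print(f"indexing finished {index_speedup:.2f}x sooner")
```

With these numbers the write path comes out at roughly 2.01x and indexing at roughly 2.98x, so the indexing improvement is actually the larger of the two.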

Tags:

Raven

Comments

08 Sep 2010
11:39 AM

Demis Bellot

Pretty good results ayende, should satisfy a lot of use-cases.

What type of documents are you using in these benchmarks?

08 Sep 2010
13:25 PM

Louis Haußknecht

According to github.com/.../Program.cs he's using relatively small User objects (Id, Email, Name).

Anyway, nice tweak!

08 Sep 2010
16:53 PM

josh

Cool. I'm probably using a version before this was added and it was already fast. Faster than SQL and MongoDB in my simplistic tests. ..and I mean a LOT!! faster than both.

08 Sep 2010
22:03 PM

Agarwal / Simon Labrecque

[This assumes the code at github.com/.../Program.cs is the code that has been run for those benchmarks]

Ayende, any reason you're calling SaveChanges once when batch == 128, then waiting until after the 5,163rd object has been processed to call SaveChanges again? I.e., you are not resetting batch to 0 in the if (github.com/.../Program.cs#L560).

08 Sep 2010
22:59 PM

Demis Bellot

For anyone interested I have modified benchmarks to include timings for Redis as well. I've kept it as close as possible to the RavenDB example including the 128 batch size which Redis doesn't need.

Basically the results show that Redis stores all 5,163 documents in 981ms, making it 2.85x quicker than RavenDB in this scenario.

I have more information available on my blog post here:

http://www.servicestack.net/mythz_blog/?p=474

Although Redis and RavenDB are not exactly the same type of NoSQL data store (RavenDB is a document database while Redis is a data structures server) they still have some overlapping use cases.

09 Sep 2010
10:14 AM

Ryan Heath

Demis, you seem to have copied over the bug Simon Labrecque is talking about.

Does it make any difference when you reset the batch counter when 128 is reached?

// Ryan

09 Sep 2010
10:52 AM

Ayende Rahien

Simon,

That is a bug, it should be batchSize % 128 == 0
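A minimal sketch of the bug under discussion, in Python rather than the C# of the benchmark program. The function name and the `use_modulo` flag are hypothetical, and it assumes a final flush happens after the loop for any leftover documents; it only counts where a SaveChanges-style flush would occur.

```python
def flush_points(total_docs: int, batch_size: int, use_modulo: bool) -> list[int]:
    """Return the document counts at which a flush would be triggered."""
    points = []
    batch = 0
    for _ in range(total_docs):
        batch += 1
        if use_modulo:
            # Fixed check: fires on every multiple of batch_size.
            should_flush = batch % batch_size == 0
        else:
            # Buggy check: the counter is never reset, so this fires only once.
            should_flush = batch == batch_size
        if should_flush:
            points.append(batch)
    # Assumed final flush for documents left over after the loop.
    if batch % batch_size != 0:
        points.append(batch)
    return points


buggy = flush_points(5163, 128, use_modulo=False)
fixed = flush_points(5163, 128, use_modulo=True)
print(buggy)        # [128, 5163]
print(len(fixed))   # 41
```

With the equality check, the second flush carries the remaining 5,035 documents in one go; with the modulo check, no flush carries more than 128.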

09 Sep 2010
10:53 AM

Ayende Rahien

Demis,

Just to point out, Redis writes to memory, RavenDB writes to disk

09 Sep 2010
11:05 AM

Demis Bellot

Ok, so there seems to be some confusion about how Redis works, so I'll just copy a paragraph from my blog explaining it in more detail:

http://www.servicestack.net/mythz_blog/?p=474

Why is Redis so fast?

Based on the comments below there appears to be some confusion as to what Redis is and how it works. Redis is a high-performance data structures server written in C that operates predominantly in-memory, routinely persists to disk, and maintains an append-only transaction log file for integrity. Both of these are configurable and can be made to write to disk on every operation.

For redundancy it includes built-in replication where you can turn any redis instance into a slave of another, which can be configured at runtime. It also features its own Virtual Machine implementation so if your dataset exceeds your available memory, un-frequented values are swapped out to disk whilst the hot values remain in memory.

Like other high-performance network servers (e.g. Nginx, Node.js, etc.) it achieves maximum efficiency by running each Redis instance as a single process, where all IO is asynchronous and no time is wasted context-switching between threads.

It achieves concurrency by being really fast and achieves integrity by having all operations atomic. You are not just limited to the available transactions either, as you can compose any combination of Redis commands together and process them atomically in a single transaction.

09 Sep 2010
11:09 AM

Ayende Rahien

Demis,

Did you configure your Redis server to write to disk on every operation (to match more closely what RavenDB is doing)?

09 Sep 2010
11:13 AM

Demis Bellot

The benchmarks are both using the standard configuration for both servers, so no.

I will re-run the benchmarks with the bug fix and configure it to write on every operation when I get home tonight.

10 Sep 2010
01:15 AM

Demis Bellot

Okay new benchmarks are in - details in my blog under the heading: Benchmarks – Take 2

http://www.servicestack.net/mythz_blog/?p=474

As any additional overhead is multiplied when the 'fsync' option is on, I removed some of the overheads imposed on the Redis client, i.e. active entity id tracking and batching (as it's not required for Redis), before enabling the append-only transaction log with the 'fsync always' option.

Note: I’m using Redis's batch-ful MSET operation behind the scenes, so the fsync penalty is only paid once.

The new benchmarks show Redis is now 11.75x faster than RavenDB with this configuration.

If you disable the append only transaction log Redis becomes 16.9x faster than RavenDB.

I'm not saying performance is the most important metric, I just wanted to show that Redis provides a high-performance NoSQL solution for .NET clients. Multiple choices benefit everyone.

Demis

10 Sep 2010
17:05 PM

Chance

Nice! I'd love to have accidents like this!

Side question:

Any idea when you guys are going to implement geocoding support at the core of Raven? I thought about hacking it in myself, but at the rate of change right now I figured that would be a bad idea. Alternatively, I could perform the algos outside in our logic but I'd rather they be native. (Map/Reduce seems like our best bet atm).

Thanks,

Chance

10 Sep 2010
22:17 PM

Ayende Rahien

Chance,

RavenDB already supports spatial queries. I need to document it, though.

17 Sep 2010
14:10 PM

Chance

Ah! I can't believe I missed the email alert for your comment Ayende. That's awesome man, thanks!

By the way, it's still on your Todo list. If you've finished that, I can only imagine what else you've knocked off of that list. You guys are rocking hard on Raven - keep it up!

Oren Eini

CEO of RavenDB