The design of RavenDB 4.0: Making Lucene reliable
I don’t like Lucene. It is an external dependency that works in somewhat funny ways, and the version we use is a relatively old one that has been mostly ported as-is from Java. This leads to some design decisions that are questionable (for example, using exceptions for control flow in parsing queries), or just awkward (by default, an error in merging segments will kill your entire process). Getting Lucene to run properly in production takes quite a bit of work and effort. So I don’t like Lucene.
We have spiked various alternatives to Lucene multiple times, but it is a hard problem, and most of the solutions we looked at led toward pretty much the same approach that Lucene takes. By now, we have been working with Lucene for over eight years, so we have gotten good at managing it, but there is still quite a bit of code in RavenDB that is dedicated to managing Lucene's state, figuring out how to recover in case of errors, etc.
Just off the top of my head, we have code to recover from aborted indexing, background processes that take regular backups of the indexes so we'll be able to restore them in the case of an error, etc. At some point we had a lab of machines dedicated to testing that our code was able to manage Lucene properly in the presence of hard resets. We got it working, eventually, but it was hard. And we still get issues from users who run into trouble because Lucene can tie itself into knots (for example, a disk full error midway through indexing can corrupt your index and require us to reset it). And that is leaving aside the joy of what I/O re-ordering does to you when you need to ensure reliability.
So the problem isn't with Lucene itself; the problem is that it isn't reliable. That led us to look at the Lucene persistence format. While Lucene's persistence mechanism is technically pluggable, in practice it isn't really replaceable. The file format and the way it works are very closely tied to the idea of files, or rather to the idea of processing data as a stream of bytes. At some point we thought it would be good to implement a Transactional NTFS Lucene directory, but that idea isn't really viable, since Transactional NTFS is going away.
It was at this point that we realized we were barking up the wrong tree entirely. We already have the technology in place to make Lucene reliable: Voron!
Voron is a low level storage engine that offers ACID transactions. All we need to do is develop a VoronLuceneDirectory, and that should handle the reliability part of the equation. There are a couple of details that need to be handled, in particular that Voron needs to know, upfront, how much data you want to write, and that a single value in Voron is limited to 2GB. But that is fairly easily done. We write to a temporary file from Lucene until it tells us to commit, at which point we can write the data to Voron directly (potentially breaking it into multiple values if needed).
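To make that concrete, here is a minimal sketch of the commit path. The IVoronTransaction, IVoronTree and WriteFileToVoron names and the "lucene-files" tree are hypothetical stand-ins for this sketch, not the real Voron or Lucene.NET APIs:

```csharp
using System;
using System.IO;

// Hypothetical stand-ins for Voron; the real Voron API differs.
public interface IVoronTree
{
    // Reads exactly 'length' bytes from 'source' and stores them under 'key';
    // the length is supplied upfront, which is what Voron needs.
    void Add(string key, Stream source, long length);
}

public interface IVoronTransaction : IDisposable
{
    IVoronTree GetTree(string name);
    void Commit();
}

public static class VoronDirectoryCommit
{
    // Stay comfortably below the 2GB per-value limit.
    private const long MaxChunkSize = 1024L * 1024 * 1024;

    // Called when Lucene commits: the temporary file Lucene has been writing to
    // is copied into Voron, chunked so that no single value exceeds the limit.
    public static void WriteFileToVoron(IVoronTransaction tx, string luceneFileName, string tempFilePath)
    {
        using (var tempFile = File.OpenRead(tempFilePath))
        {
            var tree = tx.GetTree("lucene-files");
            long remaining = tempFile.Length;
            var chunkIndex = 0;

            while (remaining > 0)
            {
                long chunkSize = Math.Min(remaining, MaxChunkSize);
                tree.Add(luceneFileName + "/" + chunkIndex, tempFile, chunkSize);
                remaining -= chunkSize;
                chunkIndex++;
            }
        }
    }
}
```

A caller would open a single write transaction, call WriteFileToVoron once for each file produced by the Lucene commit, and then commit the transaction, so the whole Lucene commit either becomes visible or doesn't.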
Voila, we have got ourselves a reliable mechanism for storing Lucene’s data. And we can do all of that in a single atomic transaction.
When reading the data, we can skip all of the hard work and file I/O and serve it directly from Voron's memory map. And having everything inside a single Voron file means that we can skip doing things like the compound file format that Lucene uses, and choose a more optimal approach.
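As a rough illustration of the read side (again, not RavenDB's actual code), this is what serving reads straight out of a memory-mapped region looks like. In the real implementation the mapping would come from Voron itself rather than from a standalone file, and MappedChunkReader is a made-up name:

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;

// Serves sequential and random reads straight from a memory-mapped view,
// with no explicit file I/O in the read path.
public sealed class MappedChunkReader : System.IDisposable
{
    private readonly MemoryMappedFile _file;
    private readonly MemoryMappedViewAccessor _view;
    private long _position;

    public MappedChunkReader(string path)
    {
        _file = MemoryMappedFile.CreateFromFile(path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read);
        _view = _file.CreateViewAccessor(0, 0, MemoryMappedFileAccess.Read); // map the whole file, read-only
    }

    public byte ReadByte()
    {
        return _view.ReadByte(_position++);
    }

    public void ReadBytes(byte[] buffer, int offset, int count)
    {
        // Copies straight out of the mapped region; the OS pages the data in as needed.
        _view.ReadArray(_position, buffer, offset, count);
        _position += count;
    }

    public void Seek(long position)
    {
        _position = position;
    }

    public void Dispose()
    {
        _view.Dispose();
        _file.Dispose();
    }
}
```

Because the reads go through the OS memory map, hot parts of the index are served from the page cache without explicit read calls in the hot path.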
And with a reliable way to handle indexing, quite large swaths of code can just go away. We can now safely assume that indexes are consistent, so we don’t need to have a lot of checks on that, startup verifications, recovery modes, online backups, etc.
Improvement by omission indeed.
More posts in "The design of RavenDB 4.0" series:
- (26 May 2016) The client side
- (24 May 2016) Replication from server side
- (20 May 2016) Getting RavenDB running on Linux
- (18 May 2016) The cost of Load Document in indexing
- (16 May 2016) You can’t see the map/reduce from all the trees
- (12 May 2016) Separation of indexes and documents
- (10 May 2016) Voron has a one track mind
- (05 May 2016) Physically segregating collections
- (03 May 2016) Making Lucene reliable
- (28 Apr 2016) The implications of the blittable format
- (26 Apr 2016) Voron takes flight
- (22 Apr 2016) Over the wire protocol
- (20 Apr 2016) We already got stuff out there
- (18 Apr 2016) The general idea
Comments
Lucene.Net is a fairly complex piece of software but not overly complex. Why not build something custom for Raven? Using the same approach as for Voron.
When Octopus used Raven, Lucene was at the centre of most of our production issues too. Raven has features to cover Lucene's warts - we had to build features into our own product to cover those!
I really think you should go all the way, and build indexing yourself. As a database company, indexing should be one of your core competencies, it makes so much sense to really invest in building that yourself.
I even think you would have been better off sticking with ESENT (it only caused us a few issues) and concentrating on removing Lucene instead of switching to Voron, given the choice.
Paul
Pop Catalin, We looked into what this would take (see the posts about Corax). But it is a very big field, and pretty complex. We decided to hold off on this for now, just fix what was the worst offender and move on. Maybe we'll be able to get to it on the 5.0 release.
Paul, A large problem is related to the type of machine you are running on. Commodity hardware sucks in many cases, and you can't rely on what the hardware will tell you.
We solved most of those problems with Voron, so just by putting Lucene on top of it we gain much better safety guarantees. I agree that indexing is something that we want to own, but that isn't very simple. Along with the other changes we are doing / intend to do in 4.0, there just isn't enough space to also replace Lucene. Esent is problematic for other reasons (it doesn't run on Linux, we don't control it, and it blocks a lot of opportunities like the blittable format optimizations).
There is no need to re-invent the wheel. Lucene is an awesome piece of software; I would only port a newer version and optimize it.
One advantage to using Lucene is that anyone with previous experience has a head start, and that knowledge is usable across a lot of other stacks (elasticsearch, solr, direct lucene). Removing the warts would be great, but keeping the "api" would be better, IMO :)
Bruno Lopes, Yes, if/when this happens, we are going to have to maintain a lot of backward compatibility.
I wonder: have you tried using Lucene (the Java version) through IKVM? I used this approach, since I needed some features which weren't available in Lucene.NET, which is 3 major versions behind and hasn't had a new release in over 3 years. I suppose you could run into performance issues though, but I'm curious if this option has been considered/profiled.
Lucas, That isn't something that would be viable for us, no. We need to be able to properly support it, and adding IKVM is just too complex for our operational requirements. We are going to be focusing on helping the next version of Lucene once we free up the capacity to do so
RavenDB queries are BASE because you build your indexes asynchronously. Why not add RDBMS-like synchronous indexes to RavenDB? I mean B+ trees for range queries. You already have B+ trees in Voron, so implementing them would not be a big deal. So, certain critical queries would never return stale info, and index based updates would be reliable. This could be your first step toward Lucene independence.
Jesus, Sync indexes or not doesn't actually matter for the implementation. We could implement sync indexes now with Lucene. And B+ tree indexes have severe limitations (you can only do queries on the specified key, in the order specified; so OrderBy FName, LName works with the index, but OrderBy LName, FName doesn't; see the sketch after this comment). Lucene handles that much more nicely.
And we considered ACID indexes, but the problem is that it would bring all the usual pain of those, and result in "all my indexes are ACID because of course I need to do everything ACID".
We added support for waiting for non-stale results when querying for a reason.
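A small illustration of the composite-key ordering limitation mentioned in the reply above (hypothetical data, not RavenDB code): an index whose key is (FName, LName) yields results in FName-then-LName order, so it can serve OrderBy FName, LName directly, while OrderBy LName, FName still needs a separate sort or a second index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CompositeKeyOrderingDemo
{
    public static void Main()
    {
        // Simulates a B+ tree style index keyed on the composite (FName, LName).
        var index = new SortedDictionary<(string FName, string LName), int>
        {
            [("Arava", "Eini")] = 1,
            [("Oscar", "Smith")] = 2,
            [("Arava", "Smith")] = 3,
            [("Oscar", "Eini")] = 4,
        };

        // Walking the index in key order is already "OrderBy FName, LName": no extra sort needed.
        foreach (var entry in index)
            Console.WriteLine($"{entry.Key.FName} {entry.Key.LName}");

        // "OrderBy LName, FName" cannot be answered by walking this index in order;
        // the results have to be re-sorted (or a second index on (LName, FName) maintained).
        foreach (var entry in index.OrderBy(e => e.Key.LName).ThenBy(e => e.Key.FName))
            Console.WriteLine($"{entry.Key.LName}, {entry.Key.FName}");
    }
}
```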
Yes, of course. But most queries don't need all the fancy things Lucene does. B+ tree indexes are far simpler than Lucene indexes, a lot easier to implement, and much more efficient.
Jesus, Sure, they are easier to implement, in fact, we already have them (in 4.0 we moved things like Raven/DocumentsByEntityName to this), but changing something as fundamental as sorting is not something that can easily be done. Leaving aside the fact that we still have to resolve the issue of ACID indexes and what that would do to the kind of optimizations that we can give by not doing them.
Why not help the guys that develop lucene.net instead of fixing what's wrong on your side?
Franck, The issues with Lucene are fundamental to the way it works. It isn't something that can be fixed short of a complete rewrite of the code. To make things concrete, consider the fact that all the I/O in Lucene is buffered, and trying to go to unbuffered I/O (which would allow safety / ACID) has an extremely high cost for the kind of usage Lucene is doing.
Ok, I see what's wrong, thanks for this short and clear answer!