Finding the “best” book scenario

time to read 3 min | 486 words

This started out as a customer engagement, but it was interesting to see how we solved it.

The problem is searching for books. Let us take the following books as good example:

image

We have users that want to have recommendations for books in specific topics, and authors can pay us to promote their books. You can see how it looks like above.

Now, the rules we want to follow for sorting the results are fairly simple. Find all the matching books, and sort them so:

  • The user has searched for a book primary tag, and the author paid to promote that tag, show first.
  • The user has searched for a book secondary tag, and the author paid to promote that tag, show second.
  • The user has searched for a book primary tag, and the author didn’t paid to promote that tag, show third.
  • The user has searched for a book secondary tag, and the author didn’t paid to promote that tag, show forth.

Actually trying to specify the sort order according to this tend to be quite hard to do, as it turns out, but we can take advantage of boosting to get what we want.

We define the following index:

from book in docs.Books
select new
{
  PaidPrimaryTag = book.Tags.Where(x=>x.Primary && x.Paid).Select(x=>x.Name),
  PaidSecondaryTag = book.Tags.Where(x=>x.Primary == false && x.Paid).Select(x=>x.Name),
  PrimaryTag = book.Tags.Where(x=>x.Primary).Select(x=>x.Name),
  SecondaryTag = book.Tags.Where(x=>x.Primary == false).Select(x=>x.Name),
}

And now we want to do a few searches: First for NoSQL and then RavenDB.

The actual query we issue is:

image

And as you can see, books/3 is shown first, because the author paid for higher ranking. What about when we do that with RavenDB?

image

We have books/3, as before, but books/2 is higher ranked than books/1. Why is that? Because books/2 paid to have a higher ranking on a secondary tag, and it is more important than even a primary tag match according to our query.

This is quite elegant, and it also allows us to take into account relevancy in the search as well.