Finding the “best” book scenario
This started out as a customer engagement, and the way we solved it turned out to be interesting.
The problem is searching for books. Let us take the following books as a good example:
We have users who want recommendations for books on specific topics, and authors can pay us to promote their books. You can see what that looks like above.
Now, the rules we want to follow for sorting the results are fairly simple. Find all the matching books, and sort them as follows:
- The user searched for a book's primary tag, and the author paid to promote that tag: show first.
- The user searched for a book's secondary tag, and the author paid to promote that tag: show second.
- The user searched for a book's primary tag, and the author didn't pay to promote that tag: show third.
- The user searched for a book's secondary tag, and the author didn't pay to promote that tag: show fourth.
As it turns out, specifying a sort order that follows these rules directly tends to be quite hard, but we can take advantage of boosting to get what we want.
We define the following index:
    from book in docs.Books
    select new
    {
        PaidPrimaryTag = book.Tags.Where(x => x.Primary && x.Paid).Select(x => x.Name),
        PaidSecondaryTag = book.Tags.Where(x => x.Primary == false && x.Paid).Select(x => x.Name),
        PrimaryTag = book.Tags.Where(x => x.Primary).Select(x => x.Name),
        SecondaryTag = book.Tags.Where(x => x.Primary == false).Select(x => x.Name),
    }
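To make the four index fields concrete, here is a minimal sketch that evaluates the same projections with plain LINQ to Objects, outside of RavenDB. The `Book` and `Tag` classes, the `Project` helper, and the sample book are assumptions inferred from the properties the index references (`Tags`, `Name`, `Primary`, `Paid`); they are not the actual documents from this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document shape, inferred from the index definition.
public class Tag
{
    public string Name { get; set; }
    public bool Primary { get; set; }
    public bool Paid { get; set; }
}

public class Book
{
    public string Id { get; set; }
    public List<Tag> Tags { get; set; } = new List<Tag>();
}

public static class IndexEntryDemo
{
    // Mirrors the four projections of the index for a single book.
    public static Dictionary<string, List<string>> Project(Book book)
    {
        return new Dictionary<string, List<string>>
        {
            ["PaidPrimaryTag"] = book.Tags.Where(x => x.Primary && x.Paid).Select(x => x.Name).ToList(),
            ["PaidSecondaryTag"] = book.Tags.Where(x => x.Primary == false && x.Paid).Select(x => x.Name).ToList(),
            ["PrimaryTag"] = book.Tags.Where(x => x.Primary).Select(x => x.Name).ToList(),
            ["SecondaryTag"] = book.Tags.Where(x => x.Primary == false).Select(x => x.Name).ToList(),
        };
    }

    public static void Main()
    {
        // Made-up sample book, used only for illustration.
        var book = new Book
        {
            Id = "books/example",
            Tags =
            {
                new Tag { Name = "NoSQL", Primary = true, Paid = true },
                new Tag { Name = "RavenDB", Primary = false, Paid = false },
            }
        };

        foreach (var field in Project(book))
            Console.WriteLine(field.Key + ": [" + string.Join(", ", field.Value) + "]");
    }
}
```

Note that a paid primary tag ends up in both PaidPrimaryTag and PrimaryTag, because the PrimaryTag projection does not filter on Paid. The same holds for paid secondary tags and SecondaryTag.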
And now we want to do a few searches: first for NoSQL and then for RavenDB.
The actual query we issue is:
And as you can see, books/3 is shown first, because the author paid for higher ranking. What about when we do that with RavenDB?
We have books/3, as before, but books/2 is higher ranked than books/1. Why is that? Because books/2 paid to have a higher ranking on a secondary tag, and it is more important than even a primary tag match according to our query.
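The ordering described above can be sketched as follows. This is not the query from the post (which is not reproduced here) but an illustration of the idea, assuming Lucene's `field:term^boost` syntax with the clauses joined by OR. The boost values 4, 3, 2 and 1 are assumptions, as are the `BuildQuery`, `MakeEntry` and `Score` helpers. `Score` is a deliberately simplified stand-in that sums the boosts of the matching fields; real Lucene scoring also factors in things like term frequency.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoostedQueryDemo
{
    // Assumed boost factors; the post does not state the actual values.
    static readonly (string Field, int Boost)[] Fields =
    {
        ("PaidPrimaryTag", 4),
        ("PaidSecondaryTag", 3),
        ("PrimaryTag", 2),
        ("SecondaryTag", 1),
    };

    // Builds a Lucene-style query string. The term is assumed to be a
    // single word that needs no escaping or quoting.
    public static string BuildQuery(string term) =>
        string.Join(" OR ", Fields.Select(f => $"{f.Field}:{term}^{f.Boost}"));

    // Index entry for a hypothetical book that carries exactly one tag.
    public static Dictionary<string, string[]> MakeEntry(string tag, bool primary, bool paid)
    {
        var hit = new[] { tag };
        var none = new string[0];
        return new Dictionary<string, string[]>
        {
            ["PaidPrimaryTag"] = primary && paid ? hit : none,
            ["PaidSecondaryTag"] = !primary && paid ? hit : none,
            ["PrimaryTag"] = primary ? hit : none,
            ["SecondaryTag"] = !primary ? hit : none,
        };
    }

    // Simplified scoring model: sum the boosts of every field containing the term.
    public static int Score(IDictionary<string, string[]> entry, string term) =>
        Fields.Where(f => entry[f.Field].Contains(term)).Sum(f => f.Boost);

    public static void Main()
    {
        const string term = "RavenDB";
        Console.WriteLine(BuildQuery(term));

        // One hypothetical candidate per rule, listed in reverse order on purpose.
        var candidates = new[]
        {
            (Name: "unpaid secondary", Entry: MakeEntry(term, false, false)),
            (Name: "unpaid primary", Entry: MakeEntry(term, true, false)),
            (Name: "paid secondary", Entry: MakeEntry(term, false, true)),
            (Name: "paid primary", Entry: MakeEntry(term, true, true)),
        };

        foreach (var c in candidates.OrderByDescending(b => Score(b.Entry, term)))
        {
            var score = Score(c.Entry, term);
            Console.WriteLine(score + "  " + c.Name);
        }
    }
}
```

Under these assumed boosts, a paid secondary match scores 3 + 1 = 4 and an unpaid primary match scores only 2, which is the same relationship that puts books/2 ahead of books/1 in the RavenDB search.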
This is quite elegant, and it also allows us to take into account relevancy in the search as well.
Comments
If you have enough data, you can go even further and apply collaborative filtering to make recommendations specifically tailored to the user. The approach is not elegant, but it is as powerful as what Amazon uses to support recommendations.
Is (PaidPrimaryTag: RavenDB) a Lucene.NET search? Does it return a bool value, the number of matches, or something else?
Stephen, No, the query is a full text query, so it is going to find anything that matches that clause OR any of the rest.