RavenDB new feature: Highlights
Before anything else, I need to thank Sergey Shumov for this feature. This is one of the features that we got as a pull request, and we were very happy to accept it.
What are highlights? Highlights are important when you want to give the user better search UX.
For example, let us take the Google Code data set and write the following index for it:
public class Projects_Search : AbstractIndexCreationTask<Project, Projects_Search.Result>
{
    public class Result
    {
        public string Query { get; set; }
    }

    public Projects_Search()
    {
        Map = projects =>
            from p in projects
            select new
            {
                Query = new[] { p.Name, p.Summary }
            };

        Store(x => x.Query, FieldStorage.Yes);
        Index(x => x.Query, FieldIndexing.Analyzed);
    }
}
And now, we are going to search it:
using (var session = store.OpenSession())
{
    var prjs = session.Query<Projects_Search.Result, Projects_Search>()
        .Search(x => x.Query, q)
        .Take(5)
        .OfType<Project>()
        .ToList();

    var sb = new StringBuilder().AppendLine("<ul>");
    foreach (var project in prjs)
    {
        sb.AppendFormat("<li>{0} - {1}</li>", project.Name, project.Summary).AppendLine();
    }
    var s = sb
        .AppendLine("</ul>")
        .ToString();
}
The value of q is: source
Using this, we get the following results:
- hl2sb-src - Source code to Half-Life 2: Sandbox - A free and open-source sandbox Source engine modification.
- mobilebughunter - BugHunter Platfrom is am open source platform that integrates with BugHunter Platform is am open source platform that integrates with Mantis Open Source Bug Tracking System. The platform allows anyone to take part in the test phase of mobile software proj
- starship-troopers-source - Starship Troopers: Source is an open source Half-Life 2 Modification.
- static-source-analyzer - A Java static source analyzer which recursively scans folders to analyze a project's source code
- source-osa - Open Source Admin - A Source Engine Administration Plugin
And this makes sense, and it is pretty easy to work with. Except that it would be much nicer if we could go further than this, and let the user know why we selected those results. Here is where highlights come into play. We will start with the actual output first, because it is more impressive:
- hl2sb-src - Source code to Half-Life 2: Sandbox - A free and open-source sandbox Source engine modification.
- mobilebughunter - open source platform that integrates with BugHunter Platform is am open source platform that integrates with Mantis Open Source
- volumetrie - code source - Volumetrie est un programme permettant de récupérer des informations sur un code source - Volumetrie is a p
- acoustic-localization-robot - s the source sound and uses a lego mindstorm NXT and motors to point a laser at the source.
- bamboo-invoice-ce - The source-controlled Community Edition of Derek Allard's open source "Bamboo Invoice" project
And here is the code to make this happen:
using (var session = store.OpenSession())
{
    var prjs = session.Query<Projects_Search.Result, Projects_Search>()
        .Customize(x => x.Highlight("Query", 128, 1, "Results"))
        .Search(x => x.Query, q)
        .Take(5)
        .OfType<Project>()
        .Select(x => new
        {
            x.Name,
            Results = (string[])null
        })
        .ToList();

    var sb = new StringBuilder().AppendLine("<ul>");
    foreach (var project in prjs)
    {
        sb.AppendFormat("<li>{0} - {1}</li>", project.Name, string.Join(" || ", project.Results)).AppendLine();
    }
    var s = sb
        .AppendLine("</ul>")
        .ToString();
}
For that matter, here is me playing with things, searching for: lego mindstorm
- acoustic-localization-robot - ses a lego mindstorm NXT and motors to point a laser at the source.
- dpm-group-3-fall-2011 - Lego Mindstorm Final Project
- hivemind-nxt - k for Lego Mindstorm NXT Robots
- gsi-lego - with Lego Mindstorm using LeJos
- lego-xjoysticktutorial - l you Lego Mindstorm NXT robot with a joystick
You can play around with how it highlights the text, but as you can see, I am pretty happy with this new feature.
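If you are wondering why some of the results above start in the middle of a word, it helps to think of a fragment as a fixed-size window of characters around the match. Here is a rough sketch of that idea in plain C#. It is not how RavenDB or Lucene actually implement it (the real work happens on the server, against the index), and the HighlightSketch class and its Fragment method are names I made up for illustration. I am also assuming that the 128 passed to Highlight in the query above plays the role of the fragment length.

```csharp
using System;

public static class HighlightSketch
{
    // Hypothetical helper, not part of RavenDB: returns a window of at most
    // fragmentLength characters around the first case-insensitive match of
    // term, or null when the term does not occur in the text.
    public static string Fragment(string text, string term, int fragmentLength)
    {
        int hit = text.IndexOf(term, StringComparison.OrdinalIgnoreCase);
        if (hit < 0)
            return null;

        // Roughly center the window on the match, then clamp it to the text bounds.
        int start = Math.Max(0, hit - (fragmentLength - term.Length) / 2);
        int length = Math.Min(fragmentLength, text.Length - start);
        return text.Substring(start, length);
    }
}
```

For example, HighlightSketch.Fragment("abcdefghij", "ef", 4) returns "defg". The window is cut by character count, not on word boundaries, which is the same kind of cut you see in the results above.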
 

Comments
Very helpful feature! What is OfType, is that the same as AsProjection?
Chanan, OfType is the same as As.
To play devil's advocate, why wouldn't you want to do this on the clientside with a JavaScript library?
It seems like you'd want to do more with the UI like tooltip hovers.
I still think it is cool and I'm not gonna complain about new features.
Khalid, let us assume that you are indexing a text field that is 2 KB in size. You don't want to send that 2 KB times NumberOfDocs. This does the work on the server side and sends you only the snippets.
My apologies, I was unclear. I didn't mean by Ajax (request to the server). I meant there are libraries that will scrape your page clientside and put the highlighting in after the page has loaded. See the example below.
http://johannburkard.de/blog/programming/javascript/highlight-javascript-text-higlighting-jquery-plugin.html
Certainly the way you described it would be an inefficient way to do it.
Khalid, we aren't talking about doing this on a single document; this is for you to be able to see the full search results with more data.
Khalid, highlighting isn't only about putting HTML tags into text. It also allows you to fetch snippets (short regions of text with matched tokens inside) and to do it fast, because all the necessary information (token offsets) is already contained in the Lucene index.
Khalid, if you look at the highlighted results, it doesn't look like it's displaying the entire project.Summary value. It's only showing the snippet of text that includes the highlighted term. That's pretty nice.
What build should I look for this in?
Matt, Any in the last week or so.
So this is more a server / administration ui feature?
I guess I am just confused as to how this actually works.
Ciel, this is a user-facing feature. You would use it for your own search pages to give better UX.