Complex Searching

time to read 6 min | 1089 words

I was asked to compare searching using O/RM tool and using a Stored Procedures. I am biased, of course, but I like to think that I biased with a reason. Anyway, Here is the UI for the search functionality that we would like to implement (don't bother to look too hard for the source of this dialog, I just used the default XP Search for files one).

ֲ 

 

We can search by any combination of the above. This is a fairly simple dialog, in my opinion, for a complex one, see here (JIRA's search). For this post, I'm not going to bring up searching by business rules, since that is a different can of worms.

To simplify the above UI, I think we can say that we need to search by any or all of the following:

  • Name
  • Content
  • Modification before
  • Modification after
  • Maximum size

 

Here is my first implementation of the procedure:

CREATE PROCEDURE SearchFileListing

ֲ ֲ ֲ ֲ ֲ  @name as nvarchar(255),

ֲ ֲ ֲ ֲ ֲ  @content as nvarchar(max),

ֲ ֲ ֲ ֲ ֲ  @modifiedAfter as datetime,

ֲ ֲ ֲ ֲ ֲ  @modifiedBefore as datetime,

ֲ ֲ ֲ ֲ ֲ  @maximumSize as int

AS

ֲ ֲ ֲ ֲ ֲ  SELECT name, content, modifiedDate, size

ֲ ֲ ֲ ֲ ֲ  FROM FileListing files

ֲ ֲ ֲ ֲ ֲ  WHERE

ֲ ֲ ֲ ֲ ֲ  ֲ ֲ ֲ  files.name = isnull(@name,files.name)

ֲ ֲ ֲ ֲ ֲ  AND files.content like '%' + isnull(@content, files.content) + '%'

ֲ ֲ ֲ ֲ ֲ  AND files.modifiedDate >= isnull(@modifiedAfter, files.modifiedDate)

ֲ ֲ ֲ ֲ ֲ  AND files.modifiedDate <= isnull(@modifiedBefore, files.modifiedDate)

ֲ ֲ ֲ ֲ ֲ  AND files.size >= isnull(@maximumSize, files.size)

 

While I have not tested it, I'm pretty sure that the performance of this stored procedure is not ideal. For one thing, it will always search all the columns, which mean that if there is an index available for the query (for instance, if we are searching by filename only), it wouldn't be used. This is the easiest SP to write (although apparently not very obvious).

To be clear, the above procedure always does a table scan (or a clustered index scan, if it has a clustered index), even if there are indexes that can be used to help this particular query. For instance, if we have an index on the size column, it will not be used with the above procedure.

We can re-write this procedure with IF statements to execute a different stored procedure per each possibility, but the number of possibilities Is too large to write a stored procedure for every possibility (which is the recommended approach). And this is for a relatively case. In most cases, using the SP approach, I would either suffer the performance degradation with a procedure like the one above, create a few common search paths (for instance, not including the Content column) or turn to sp ExecuteSql procedure (and the problems it entails).

 

In comparison, check out the code to handle searching using NHibernate:

 

ICriteria query = session.CreateCriteria(typeof(File));

if (nameTxtBx.Text != "")

ֲ ֲ ֲ  query.Add(Expression.Eq("File", nameTxtBx.Text ));

if (contentTxtBx.Text != "")

ֲ ֲ ֲ  query.Add(Expression.Like("Content", "%"+contentTxtBox.Text+"%"));

if (modifiedSelect.HasSelection)

ֲ ֲ ֲ  query.Add(Expression.Between("Modified", modifiedSelect.From, modifiedSelect.To));

if (sizeSelect.HasSelection)

ֲ ֲ ֲ  query.Add(Expression.Le("Size", sizeSelect.Value));

return query.List();

 

As a note, this is not only easier to read, it will also produce queries that will perform much better than the SP approach, because they will search for exactly the right stuff, and allow SQL Server to use the indexes on the table without going to the trouble of writing every possibility (which has its own set of problem).