NHibernate – Beware of inadvisably applied caching strategies


One of the usual approaches to performance problems in most applications is to just throw caching at the problem until it goes away. NHibernate supports a very sophisticated caching mechanism, but it is disabled by default. Not only that, but there are multiple levels of opt-ins that you have to state explicitly before you can benefit from caching.
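To make those opt-in levels concrete, here is a minimal sketch of the session factory settings involved. It assumes an XML configuration file and the in-memory HashtableCacheProvider that ships with NHibernate (intended for testing, not production). Defaults differ between versions, so treat the values as illustrative rather than required.

```xml
<hibernate-configuration xmlns="urn:nhibernate-configuration-2.2">
  <session-factory>
    <!-- Opt-in 1: choose a cache provider (assumption: the built-in test provider) -->
    <property name="cache.provider_class">NHibernate.Cache.HashtableCacheProvider, NHibernate</property>
    <!-- Opt-in 2: turn on the second level cache -->
    <property name="cache.use_second_level_cache">true</property>
    <!-- Opt-in 3: turn on the query cache, if you intend to cache queries -->
    <property name="cache.use_query_cache">true</property>
  </session-factory>
</hibernate-configuration>
```

On top of this, each entity and each collection needs its own cache element in the mapping, and each query has to be marked as cacheable, before anything is actually cached.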

Why is that?

The answer is quite simple: caching is an incredibly sensitive topic, involving such things as data freshness, target size, repetitive requests, etc. Each and every time I have seen caching used as a hammer, it ended in tears, with a lot of micromanagement of the cache and quite a bit of frustration.

I want to give you an example using the simple Blog->>Posts model. What happens if I want to display the blog and its posts? The code could look like this:

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    var blog = session.Get<Blog>(2);
    foreach (var post in blog.Posts)
    {
        Console.WriteLine(post.Title);
    }
    tx.Commit();
}

And the mappings are:

<class name="Blog" table="Blogs">
    <cache usage="read-write"/>
    <id name="Id">
        <generator class="identity"/>
    </id>
    <property name="Title"/>
    <property name="Subtitle" />
    <property name="AllowsComments" />
    <property name="CreatedAt" />
    <bag name="Posts" table="Posts" inverse="true">
        <cache usage="read-write"/>
        <key column="BlogId"/>
        <one-to-many class="Post"/>
    </bag>
</class>

<class name="Post" table="Posts">
    <id name="Id">
        <generator class="identity"/>
    </id>
    <property name="Title" />
    <many-to-one name="Blog" column="BlogId"/>
</class>

Do you see the horrible issue in here? You probably don’t yet, but you will in a moment. Let us see what happens on the first run of this code:

[image: the queries issued on the first run]

That is about as good as it gets. But what about the second time?

[image: the queries issued on the second run]

Ouch!

What just happened?!

Well, we loaded the blog from the cache, and then we loaded the Blog’s Posts collection from the cache. So far, it is working really nicely for us. However, the next thing we see is a huge SELECT N+1, and we end up with a lot more queries in the cached scenario than in the non-cached one.

The problem is that when we cache a collection, we aren’t caching the data in that collection. We are only caching the ids, which means that NHibernate gets the collection of ids and then tries to resolve them one by one. Remember that I said the mapping above has a horrible problem? While the Posts collection is cached, the Posts themselves are not, requiring NHibernate to go to the database for each and every one of them.
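One way to close that gap, as a sketch rather than a prescription, is to give Post its own cache element, so that the ids held in the cached collection can be resolved from the second level cache instead of the database. The read-write usage below simply mirrors the Blog mapping and is an assumption; choose whatever matches how your posts are actually updated.

```xml
<class name="Post" table="Posts">
    <!-- Added: cache the Post entities themselves, not just the ids in Blog.Posts -->
    <cache usage="read-write"/>
    <id name="Id">
        <generator class="identity"/>
    </id>
    <property name="Title" />
    <many-to-one name="Blog" column="BlogId"/>
</class>
```

With this in place, a warm cache should be able to serve the blog, the collection of post ids, and the posts without a query per post. Whether caching posts at all is a good idea still depends on the freshness and size concerns mentioned earlier.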

Have I said ouch already? Be careful what you cache, and make sure that you aren't doing caching in a way that will actively harm you.

The same applies to the query cache as well: if you have a cached query that loads entities, you want to make sure that those entities are also cached.