NHibernate – Beware of inadvisably applied caching strategies
One of the usual approaches to performance problems in most applications is to just throw caching at the problem until it goes away. NHibernate supports a very sophisticated caching mechanism, but it is disabled by default. Not only that, but there are multiple levels of opt-ins that you have to state explicitly before you can benefit from caching.
Why is that?
The answer is quite simple: caching is an incredibly sensitive topic, involving such things as data freshness, target size, repetitive requests, etc. Each and every time I have seen caching used as a hammer, it ended in tears, with a lot of micromanagement of the cache and quite a bit of frustration.
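To make those opt-ins concrete, here is a rough sketch of what the session factory configuration could look like. The provider shown (SysCache) is only an assumption for illustration; any second level cache provider you have installed would go in its place.

```xml
<hibernate-configuration xmlns="urn:nhibernate-configuration-2.2">
  <session-factory>
    <!-- Opt-in 1: turn the second level cache on and choose a provider.
         Assumption: SysCache is used here purely as an example provider. -->
    <property name="cache.use_second_level_cache">true</property>
    <property name="cache.provider_class">NHibernate.Caches.SysCache.SysCacheProvider, NHibernate.Caches.SysCache</property>

    <!-- Opt-in 2: only needed if you also want cached queries. -->
    <property name="cache.use_query_cache">true</property>
  </session-factory>
</hibernate-configuration>
```

Even with this in place, nothing is cached yet. Each entity and each collection still needs its own cache element in the mapping, and each query has to be marked as cacheable individually.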
I wanted to give you an example using the simple Blog->>Posts model. What happens if I want to display the blog and its posts? The code could look like this:
using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    var blog = session.Get<Blog>(2);
    foreach (var post in blog.Posts)
    {
        Console.WriteLine(post.Title);
    }
    tx.Commit();
}
And the mappings are:
<class name="Blog" table="Blogs">
  <cache usage="read-write"/>
  <id name="Id">
    <generator class="identity"/>
  </id>
  <property name="Title"/>
  <property name="Subtitle" />
  <property name="AllowsComments" />
  <property name="CreatedAt" />
  <bag name="Posts" table="Posts" inverse="true">
    <cache usage="read-write"/>
    <key column="BlogId"/>
    <one-to-many class="Post"/>
  </bag>
</class>
<class name="Post" table="Posts">
  <id name="Id">
    <generator class="identity"/>
  </id>
  <property name="Title" />
  <many-to-one name="Blog" column="BlogId"/>
</class>
Do you see the horrible issue in here? You probably don't yet, but you will in a moment. Let us see what happens on the first run of this code:
That is about as good as you can make it. But what about the second time?
Ouch!
What just happened?!
Well, we loaded the blog from the cache, and then we loaded the Blog's Posts collection from the cache. So far, it is working really nicely for us. However, the next thing we see is a huge SELECT N+1, and we end up with a lot more queries in the cached scenario than in the non-cached scenario.
The problem is that when we cache a collection, we aren't caching the data in that collection. We are only caching the ids. That means that NHibernate gets the collection of ids from the cache and then tries to resolve them one by one. Remember that I said that the mapping above has a horrible problem? While the Posts collection is cached, the Posts themselves are not, requiring NHibernate to go to the database for each and every one of them.
Have I said ouch already? Be careful what you cache, and make sure that you aren't doing caching in a way that will actively harm you.
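For the mapping above, one way out is to cache the Post entity as well, so the ids held in the cached collection can be resolved from the cache instead of the database. This is only a sketch, and it assumes that Post data is actually safe to cache in your scenario:

```xml
<class name="Post" table="Posts">
  <!-- Assumption: posts change rarely enough that caching them is acceptable.
       Without this element, each id in the cached Posts collection
       is resolved with its own SELECT. -->
  <cache usage="read-write"/>
  <id name="Id">
    <generator class="identity"/>
  </id>
  <property name="Title" />
  <many-to-one name="Blog" column="BlogId"/>
</class>
```

Keep in mind that if the posts drop out of the cache before the collection does, you are back to the same SELECT N+1. The other option is to not cache the collection at all.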
The same applies to the query cache as well. If you have a cached query that loads entities, you want to make sure that the entities are also cached.
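As a hedged illustration of the query cache case, a named query can be marked as cacheable in the mapping file. The query name PostsByBlog is made up for this example, and the sketch assumes the query cache is enabled in the session factory configuration and that class names resolve the same way they do in the mapping above:

```xml
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <!-- Hypothetical named query. Like a cached collection, a cached query
       should be paired with a cached entity (here, Post), otherwise each
       cached result is loaded from the database one by one. -->
  <query name="PostsByBlog" cacheable="true">
    from Post p where p.Blog.Id = :blogId
  </query>
</hibernate-mapping>
```

It could then be run with something like session.GetNamedQuery("PostsByBlog").SetParameter("blogId", 2).List<Post>().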
Comments
I haven't tried your app yet since it's not Hibernate compatible, but if it analyzes these types of situations and warns us mere mortals of the traps then "NH Profiler is your friend" is what the people will say
The only time I'm using the second level cache is for entities and collections that rarely/never change, like customer types or countries. Caching queries or collections of entities that change frequently is likely to cause more issues than it solves.
Is there a cache in NHibernate that works with MS SQL notifications?
Cali,
NH Prof IS Hibernate Compatible.
Would you like to take part in the private beta?
Dmitry,
You would be surprised how much a 5 min cache can help app perf.
And yes, there is such a thing, SysCache2 can understand SqlDependencies.
Sign me up
Thanks, Ayende, I finally understood how Get/Load works and its effects on the second level cache.
But if I'm using ActiveRecord, and I don't usually have access to the Session, how do I load entities with ActiveRecord so that they'll go into the second level cache ?
Also, how do I put a collection of entities into the second cache all at once, without having select n+1 ?
Miki,
FindByPrimaryKey translates to that, and calling it with true or false will render the appropriate get/load call.
And you just load the collection
Ah, excellent, thanks.
Though, what if I don't know the IDs of what I want and I need to query for them? Would I call FindByPrimaryKey on each one of them afterwards so it'll store the id?
So in the example you outlined, what would be the solution?
But you didn't put the <cache in the Post mapping. Now the question is: to solve this problem, is it better to enable the cache on Post or to turn off the cache?
Shame on me. Working with NHibernate for at least 3 years I should know the answer but of course that a simple test can show it to us.
But without any test, I think that enabling cache on Post will solve the Select N+1 problem.
So how do we cache data instead of ids with NH ?
In theory I guess it would be possible for NHibernate to make an informed decision whether to load from cache or not.
iocer,
You need to cache the entities as well
Cassio,
That depends on your scenario, I can't really tell.
Enabling cache on Post would be easiest, but is it cacheable? For how long? If it drops out of the cache, you are back to the same problem.
That is why this is sensitive to context.
Michael,
Either don't use caching, or cache Post.
See my reply to Cassio for the reasoning behind that decision
NHibernate in Action (pg156) says:
WTF?
The session cache is not the same thing as this (2nd level cache)
RichB,
What I am talking about here is the 2nd level cache
Ooops. Sorry. I'm still learning.
Is there a way to tell NHibernate to cache an entire table the first time any entity is loaded from it? I often use a series of small lookup tables which my entities reference and it would make sense to load the entire lookup table rather than query the DB for each value
James,
Yes