Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 572 words

I wrote it because of a particular problem that I ran into, which is not something that I have heard much discussion about. In my application, I need to query the database for certain data, and the best way to do that would be an IN query. For the purpose of discussion, I want to find all the customers associated with the current user.

Unfortunately, I can't do something as simple as "Where.Customer.User == CurrentUser", because a customer may be associated with a user in many complex and interesting ways (the end result is a 3 page query, btw). Therefore, I calculate who the relevant customers are for a user when they log in, and cache that.

So, I need to ask the database for all the customers associated with the user, and since I already know who the user's customers are, In() was a natural way to go. The problem is that I began to run into SQL Server's limit of 2,100 parameters per query when important users (who have a lot of associations) started to use that.

Can you say major stumbling block? There are several solutions for that, and the one I chose is to extend NHibernate to perform an IN query on an XML datatype, as described here. You can see the implementation here.

Why use an IN on XML instead of a join against the data? I could send it to the database using BulkCopy and then join against that very easily, no? (I describe one such way here.)

The BulkCopy & Join approach would probably turn out to be faster than an IN on an XPath (I haven't tested it, though). But as it turns out, I had several reasons for choosing the XML approach:

  • Using the BulkCopy & Join approach would mean that I need to perform two DB queries instead of one.
  • Someone has to be responsible for clearing the joined table at some point, which is another thing to deal with.
  • It requires a two-step process, with no easy way to back out of it when there is only a small number of items that we want to check.
  • Caching

Using BulkCopy & Join basically means that I have no real way to avoid hitting the database altogether. Using the XML approach, I am merely adding a (potentially large) parameter to the query, and letting NHibernate deal with everything else.

The way you use this is simple:

session.CreateCriteria(typeof(Customer))
    .Add(XmlIn.Create("id", potentiallyLargeAmount))
    .List();

It will automatically fall back to normal IN behavior if you are not running on SQL Server 2005, or if the number of items that you are checking is smaller than 100.
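The linked implementation isn't reproduced here, but the underlying idea can be sketched. The names and SQL shape below are my own assumptions, not the actual XmlIn code: serialize the ids into one XML document that travels as a single parameter, and let SQL Server 2005 shred it with nodes()/value(), so the query needs exactly one parameter no matter how many ids there are.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class XmlInSketch
{
    // Serialize the id list into the single XML parameter value.
    public static string BuildIdsXml(IEnumerable<int> ids)
    {
        StringBuilder sb = new StringBuilder("<ids>");
        foreach (int id in ids)
            sb.Append("<id>").Append(id).Append("</id>");
        return sb.Append("</ids>").ToString();
    }

    // The shape of the SQL such a criterion might emit (hypothetical):
    // the whole list travels as @ids, a single xml-typed parameter.
    public const string SqlShape =
        "SELECT c.* FROM Customers c WHERE c.Id IN " +
        "(SELECT i.v.value('.', 'int') FROM @ids.nodes('/ids/id') AS i(v))";
}
```

The IN clause then selects against the shredded ids server-side, rather than against thousands of individual parameters.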

time to read 3 min | 524 words

This is a somewhat specific scenario, but let us assume that you have an application where you want to specialize the services of the application by the current user. If the user belongs to the Northwind customer, you want one behavior; if it belongs to the Southsand customer, you want a different behavior. Users from all other customers get the default behavior.

To make it simple, let us talk about NHibernate configuration. You have a default schema that you use for most customers, and you specialize it for those customers that want extra. This means that you need to keep a session factory per customer, because you have a different schema than the default one (changing the connection string is not enough).

To be clear, this is not about entity inheritance; this is about specialization of the entire application, which I just happen to demonstrate via NHibernate configuration.

Now, Windsor supports this ability by having a parent container and child containers, but Binsor didn't expose this functionality easily. I did some work on it today, and the end result is that you can configure it like this:

We have the global container (implicit to Binsor, since we created it before we ran the Binsor script), then we run over the configuration files and create a container per file. We register them in the ContainerSelector, which is an application-level service (below).

import HierarchicalContainers
import System.IO

Component("nhibernate_unit_of_work", IUnitOfWorkFactory, NHibernateUnitOfWorkFactory,
	configurationFileName: """..\..\hibernate.cfg.xml""")
	
Component("nhibernate_repository", IRepository, NHRepository)
Component("container_selector", ContainerSelector)

for configFile in Directory.GetFiles("""..\..\Configurations""", "*.cfg.xml"):
	continue if Path.GetFileName(configFile) == "hibernate.cfg.xml"
	print "Build child configuration for ${configFile}"
	child = RhinoContainer(IoC.Container)
	using IoC.UseLocalContainer(child):
		Component("nhibernate_unit_of_work", IUnitOfWorkFactory, NHibernateUnitOfWorkFactory,
			configurationFileName: configFile)
		Component("nhibernate_repository", IRepository, NHRepository)
	#need to remove both .cfg and .xml
	containerName = Path.GetFileNameWithoutExtension(Path.GetFileNameWithoutExtension(configFile))
	IoC.Container.Resolve(ContainerSelector).Register(containerName, child)

You can use it like this, entering and leaving the context of a client at will:

RhinoContainer container = new RhinoContainer("Windsor.boo");
IoC.Initialize(container);
ContainerSelector containerSelector = IoC.Resolve<ContainerSelector>();
containerSelector.PrintChildContainers();
using(UnitOfWork.Start())
{
    Console.WriteLine(
        NHibernateUnitOfWorkFactory.CurrentNHibernateSession
            .Connection.ConnectionString
        );
}
using(containerSelector.Enter("Northwind"))
{
    using (UnitOfWork.Start())
    {
        Console.WriteLine(
            NHibernateUnitOfWorkFactory.CurrentNHibernateSession
                .Connection.ConnectionString
            );
    }
}
using (containerSelector.Enter("Southsand"))
{
    using (UnitOfWork.Start())
    {
        Console.WriteLine(
            NHibernateUnitOfWorkFactory.CurrentNHibernateSession
                .Connection.ConnectionString
            );
    }
}

Because this is a fairly complex topic, I have created a simple reference implementation that you can get here:

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/trunk/SampleApplications/HierarchicalContainers
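As a rough illustration of the selector idea, here is a simplified sketch of my own, not the reference implementation linked above: the service keeps named child containers, and Enter swaps the current context, returning an IDisposable that restores the previous one, which is what makes the nested using blocks above work.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, minimal ContainerSelector; the real one would hold
// RhinoContainer instances rather than plain objects.
public class ContainerSelector
{
    private readonly Dictionary<string, object> children =
        new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);

    [ThreadStatic] private static object current;

    public object Current { get { return current; } }

    public void Register(string name, object childContainer)
    {
        children[name] = childContainer;
    }

    // Enter the context of a client; disposing restores the previous context.
    public IDisposable Enter(string name)
    {
        object previous = current;
        current = children[name];
        return new DisposableAction(delegate { current = previous; });
    }

    private delegate void Proc();

    private class DisposableAction : IDisposable
    {
        private readonly Proc action;
        public DisposableAction(Proc action) { this.action = action; }
        public void Dispose() { action(); }
    }
}
```

Services resolved inside an Enter block would then come from the client's child container, falling back to the global one for everything that isn't specialized.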

time to read 1 min | 161 words

I just made a small change to the EnsureMaxNumberOfQueriesPerRequestModule: when it detects that the number of queries performed goes beyond the specified value, it now also includes the queries that it detected in the exception message. A very minor change, but the effect is that I can just scroll the page and say "Oh, I have a SELECT N+1 here", directly off the exception page.

On a side note, I am getting better at optimizing NHibernate-based applications, and I strongly suggest that anyone using NHibernate look at Multi Query in 1.2 (and Multi Criteria on the trunk) for those kinds of things. It gives you quite a bit of power.

I ran into several places today where we would read & write large amounts of data in a single request, and that triggered the max query limit. It took a while, but some interesting usage of both batching & multi queries dropped the database roundtrips by an order of magnitude. Nice.
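The counting logic behind such a module can be sketched like this. This is a simplified, hypothetical version, not the actual EnsureMaxNumberOfQueriesPerRequestModule: record every SQL statement as it executes, and once the threshold is crossed, throw with the accumulated list in the message, so the SELECT N+1 culprit is visible right on the error page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-request query counter; a real module would hook this
// into NHibernate's logging or interception and reset it per request.
public class QueryCounter
{
    private readonly int maxQueries;
    private readonly List<string> queries = new List<string>();

    public QueryCounter(int maxQueries)
    {
        this.maxQueries = maxQueries;
    }

    // Called once per SQL statement executed during the request.
    public void QueryExecuted(string sql)
    {
        queries.Add(sql);
        if (queries.Count > maxQueries)
            throw new InvalidOperationException(
                "Exceeded maximum of " + maxQueries + " queries per request:" +
                Environment.NewLine +
                string.Join(Environment.NewLine, queries.ToArray()));
    }
}
```

Because the exception carries every statement seen so far, a repeated query in the list points straight at the loop that should have been a single query or an eager load.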

time to read 2 min | 334 words

Here is an interesting approach to getting deep object graphs efficiently. This will ensure that you get all the relevant collections without having to lazy load them and without a huge Cartesian product. It is especially useful if you want to load a collection of items with the associated deep object graph.

public Policy GetPolicyEagerly(int policyId)
{
	IList list = ActiveRecordUnitOfWorkFactory.CurrentSession.CreateMultiQuery()
		.Add(@"from Policy policy left join fetch policy.PolicyLeadAssociations
			where policy.Id = :policyId")
		.Add(@"from Policy policy left join fetch policy.PolicyEmployeeAssociations
			where policy.Id = :policyId")
		.Add(@"from Policy policy left join fetch policy.PolicyManagerAssociations
			where policy.Id = :policyId")
		.Add(@"from Policy policy left join fetch policy.PolicyDepartmentAssociations
			where policy.Id = :policyId")
		.Add(@"from Policy policy left join fetch policy.PolicyCustomerAssociations
			where policy.Id = :policyId")
		.SetParameter("policyId", policyId)
		.List();
	IList firstResultList = (IList) list[0];
	if (firstResultList.Count == 0)
		return null;
	return (Policy) firstResultList[0];
}

The domain above is a fake one, by the way; don't try to make any sense of it.

Shocking Rob

time to read 1 min | 129 words

I am posting this mainly because I want to see how far I can shock Rob Conery:

[image: screenshot of the exception page]

The exception is raised by the EnsureMaxNumberOfQueriesPerRequestModule, and the limit is currently set at the development level; for QA/Staging, I would probably reduce it further, although I have some pages where I

Oh, and to Rob: that was a classic error of doing a query per node instead of a single query (I added an eager load instead of a query and was done). I am doing some performance tuning right now, and all in all, it is very boring. Find a hot spot, consolidate data access, use MultiCriteria or MultiQuery, move on.

Imprisoning Mort

time to read 3 min | 509 words

Nick Malik responded to the discussion around his Tools for Mort post. He has a unique point of view.

If you cannot make sure that Mort will write maintainable code, make him write less code.    Then when it comes time for you (not Mort) to maintain it (he can't), you don't.  You write it again.

Okay, so you have a tool that makes sure that Mort doesn't write a lot of code with it. Now Mort has left and I need to maintain the code. How do I do it? I can't do it with the tools that Mort used, because they are intentionally crippled. You guessed it: time to rewrite, and it is not just a rewrite of Mort's code, it is a rewrite that needs to add functionality that existed in Mort's framework but was crippled so Mort wouldn't damage himself with it.

Sorry, that is the wrong approach to take.

Someone 'smart' has written the 'hard' stuff for Mort, and made it available as cross cutting concerns and framework code that he doesn't have to spend any time worrying about.  Mort's code is completely discardable.

I thought that Pie-In-The-Sky frameworks were already widely acknowledged as a Bad Thing.

Does Mort put process first or people first?  He puts people first, of course.  He writes the code that a customer wants and gets it to the customer right away.  The customer changes the requirements and Mort responds.  If it sounds like a quick iteration, that is because it is.

Agile doesn't mean iterations; agile means working software and enabling change. You say that Mort can respond quickly to changes in the application, but that is only true during the first few iterations. After that, Mort is too busy fighting with the code to add any significant value to the application.

Possible Answer: We can have Mort consume a service.  He can't change it.  He can't screw it up.  But he can still deliver value

I really don't have your faith in Mort's inability to screw things up. What do you do when Mort decides to "improve" performance by making the service calls in an endless loop on another thread?

After getting those points across, I would like to protest most strongly about the general tone of Nick's post. It is extremely derogatory toward Mort, and it precludes in advance the ability to improve. I am a Mort on quite a few levels (the entire WinFX stack comes to mind); does this mean that I am bound to write unmaintainable code and should be locked down to a very small set of "safe" choices, chosen for me by those Above Me?

Sorry, I really can't accept this approach, and while it explains some of the stuff that Microsoft puts out, the only thing it accomplishes is to stifle innovation and development on the platform. If this is Nick's idea about how things should be, it is very sad. I seriously hope that this isn't the accepted position at Microsoft.

time to read 1 min | 169 words

This is a biggie for me, because it enables a much nicer syntax for a lot of stuff. But first, let us look at this:

using(new ExceptionDetector())
{
	if(new Random().Next(1,10) % 2 == 0)
		throw new Exception();
}

How can you tell, from the ExceptionDetector, whether an exception was thrown or not? Well, conventional wisdom, and what I thought until 15 minutes ago, says that you can't. I want to thank Daniel Fortunov for teaching me this trick:

using System;
using System.Runtime.InteropServices;

public class ExceptionDetector : IDisposable
{
    public void Dispose()
    {
        // GetExceptionCode() returns 0 when Dispose runs normally, and a
        // non-zero code when we are unwinding because of an exception
        if (Marshal.GetExceptionCode() == 0)
            Console.WriteLine("Completed Successfully!");
        else
            Console.WriteLine("Exception!");
    }
}
Amazing!
time to read 8 min | 1474 words

Frans had a long comment on my last post; I started to reply in a comment, but it grew too big for that.

 Let's use ObjectBuilder from the entlib as an example. Anyone who hasn't read the code or its non-existing dev docs, go read the code and the unittests, then come back here and explain in detail how it works inside and proof you're right.

Just to answer that: I have read the OB source, and I have never bothered to look at whatever documentation exists for it.
After reading the code and the tests, I was able to extend OB to support generic inference capabilities. (Given IRepository<T> whose implementor is NHibernateRepository<T>, when asked for IRepository<Customer>, return NHibernateRepository<Customer>.) That is an advanced feature of IoC, added to a container that I wasn't familiar with, without affecting the rest of the functionality.

Oh, and while I can probably give a short description of how OB works, I am by no means an expert, nor can I really explain how OB works. But I was able to go in, understand the area that I wanted to modify, and make a change that greatly benefited me, without breaking anything else. That is the value of maintainable code.

And this is for the people who think that I bash the P&P code: I made similar changes to ObjectBuilder and to Windsor, in around the same time frame. I had a harder time dealing with ObjectBuilder, but that is probably due to unfamiliarity with the project.
The simple fact that I was able to make a significant change without having to grok the entire ObjectBuilder says quite a bit about the quality of the code.

 

If you can reverse engineer that from unit-test code, well, good for you and I'm sure your boss will be very happy to hear that you won't create a single bug ever again

This is something that you have repeated several times, and I want to explicitly disagree with this statement: understanding the code doesn't mean no bugs. It means fewer bugs, for sure, but not zero bugs. Most bugs occur not because you misunderstand what the code does, but because of a simple mistake (if instead of if not, for instance) or not considering a particular scenario (who would ever try to paste 2MB of text here?). Understanding helps reduce bugs, but it can't eliminate them. I doubt you can claim that LLBLGen has no bugs.

What I find a little funny is that you apparently forget which kind of comments were placed inside the nhibernate sourcecode before it went v1.0: things like "// I have no idea what this does" or "// why is this done here?" or similar comments. Apparently, the people who ported the hibernate code over to .NET didn't understand how it worked by simply looking at the code AND with all the unittests in mind.

There are similar comments there right now, and they exist in the original Hibernate source code as well. I will freely admit that there are parts of NHibernate that I have no idea how they work. What I do know is that this lack of knowledge about the way some parts work has not hindered my ability to work with NHibernate or extend it.

By digging into sourcecode and understanding what it precisely does already takes a lot of time as you have to parse and interpret every line in the code and REMEMBER the state of the variables it touches! Can you do that in your head? I can't.

That is not something that I can do. Due to this limitation, I work in a way that ensures that I do not need to understand the entire system and the implications of each and every decision at any given point. I consider this approach a best practice, because it means that I can work on a piece of code without having to deal with the implications for dozens of other components. Documentation wouldn't help here, unless I had a paragraph per line of code, kept them in sync at all times, and remembered to read them at all times.

Add to that the wide range of decisions one has to make to build a system like that and with just the end-result in your hand it's a hell of a job to come to the level where you understand why things are done that way.

I disagree. I can determine intent from code and from tests, and I can change the code and see tests break if necessary. That is a much faster way than trying to analyze the flow of each instruction in the application.

Typical example: saving an entity graph recursively in the right order. That's a heck of a complex pipeline, with several algorithms processing the data after eachother, and the code branches out to a lot of different subsystems.

If one can determine by JUST LOOKING AT THE CODE why it is designed the way it is, more power to him/her, but I definitely won't be able to do so. What I find surprising is that some people apparently think they can, with no more info than the end result of the code.

You wouldn't be able to understand that from the end result of the code, but having a test in place will allow you to walk through it in isolation, if needed, and understand what is going on. Here is a secret: I can usually understand what NHibernate is doing in such scenarios without looking at the code, because the logic is fairly straightforward to understand (but not to implement). I taught myself NHibernate by building the NHibernate Query Analyzer: no documentation, very little help from other people at the time, but a lot of going through the code and grokking the way it works.

What I find even more surprising is that it apparently is a GOOD thing that there's no documentation.

No, what I am saying is that I would rather have good code with unit tests than code that has extensive documentation. Again, I am not against documentation at the right level: high-level architecture, broad implementation notes, build script documentation. To go further than that seems to me to reach the point of diminishing returns.

It apparently is EASIER to read code, interpret every line, remember every variable's state it touches, follow call graphs all over the place, write down pre-/post- conditions along the way

That is the part that I think we disagree about: I don't need to do that.

Perhaps they're payed by the hour

Around 90% of the time that I spend on NHibernate comes out of my own free time, not paid for by anyone. You can rest assured that I care a lot about not wasting my own time. This is the real world, and good, unit-tested code is maintainable, proven by the fact that people go in and make safe changes to it, without having to load all the permutations of the system into their heads.

time to read 1 min | 133 words

I just needed to find an answer to a question about MonoRail. The MonoRail code base is over 75,000 lines of code, including all the supporting projects and tests. Castle.MonoRail.Framework has about 37,700 lines of code. The question was something that I had never really thought about, and I had no idea where to start looking. I opened the solution and started hunting. It took about five or six minutes to find the correct spot, another two to verify my assumption and be shocked that there is a private method there :-) and then I was done.

Under ten minutes to answer a question that I had never thought about, in a significant code base. ReSharper helps, of course, but nothing beats well-structured code for maintainability.

Oh, and MonoRail has very little implementation documentation.
