How to fight race conditions with a single thread
All of a sudden, AMBUSH!
A piece of code that we had been using successfully for the last four or five months started throwing a NullReferenceException. This exception was one of the impossible ones. The same piece of code would work flawlessly as long as it wasn't doing much, but the moment you tried to do something very heavy (CPU && memory wise), it started crashing. The crash was always in the same place, but with wildly different (and all correct) inputs.
Running the code in the debugger wasn't much help. You would get the exception, find the line and look at it, check all the inputs and see that they were correct. Running the code a second time (either by moving the instruction pointer or by quick eval) would succeed.
That was the point where I was really grateful that I insist on using debug builds of all our framework stuff (NHibernate, baseline business framework, etc). I started debugging this issue furiously. No success. I stripped it down to just the business logic running in a console application, trying to isolate the conditions that caused the error. I ruled out threading issues, IIS weirdness, corrupted memory and the weird state of the moon.
Remember, it wasn't easily reproducible in the debugger; trying to step into the offending code always worked. Eventually, I found myself looking at the following lines:
if (context.Locator.Contains(key))
    return context.Locator.Get(key);
// create, store and return value
That was part of a custom instantiation policy for Object Builder.
Can you see the bug now?
Wait for it...
Object Builder uses a weak reference dictionary to store some stuff, in this case the object that I needed returned. Between the first and second lines, a GC collection can occur, rendering the check pointless: Contains says the object is there, but by the time Get runs it is gone and Get returns null. The next time I tried to run this, it would correctly realize that it didn't have the value and create it, which is what made debugging it so crazy. The fix?
object val = context.Locator.Get(key);
if (val != null)
    return val;
// create, store and return value
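To see why reading once is enough, here is a minimal, self-contained sketch of the same pattern. WeakLocator and LocatorPolicy are hypothetical stand-ins that I made up for illustration; they are not Object Builder's actual classes, and the real locator may well behave differently in the details. The point is only that once the value sits in a strong local variable, the GC can no longer take it away between the check and the return.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a locator backed by weak references.
// This is NOT Object Builder's real class; the names are illustrative only.
public sealed class WeakLocator
{
    private readonly Dictionary<string, WeakReference> entries =
        new Dictionary<string, WeakReference>();

    public void Add(string key, object value)
    {
        entries[key] = new WeakReference(value);
    }

    // True only if the target happens to be alive at this instant.
    // The answer can be stale by the time the caller acts on it.
    public bool Contains(string key)
    {
        WeakReference reference;
        return entries.TryGetValue(key, out reference) && reference.IsAlive;
    }

    // Returns the target, or null if it was never stored or has been collected.
    public object Get(string key)
    {
        WeakReference reference;
        return entries.TryGetValue(key, out reference) ? reference.Target : null;
    }
}

public static class LocatorPolicy
{
    // Read once into a strong local. From here until the return, the local
    // keeps the object alive, so the null check cannot be invalidated by a GC.
    public static object GetOrCreate(WeakLocator locator, string key, Func<object> create)
    {
        object value = locator.Get(key);
        if (value != null)
            return value;

        value = create();
        locator.Add(key, value);
        return value;
    }
}
```

Calling LocatorPolicy.GetOrCreate(locator, "service", () => new object()) twice while the caller still holds the first result returns the same instance both times. If the first result has been collected in the meantime, the second call quietly creates a fresh one instead of handing back null.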
I burned over two days on this bug, and I don't think I have been this angry with a bug in a long time.