I am being stalked by CLR bugs

time to read 3 min | 441 words

imageI just spent several hours tracking down a crashing but in my current project.

The  issue was, quite clearly, a problem with releasing unmanaged resources. So I tightened my control over resources and made absolutely sure that I am releasing everything properly.

I simply could not believe what was going on. I knew what they code is doing, and I knew that what I was getting was flat out impossible.

Yes, I know that we keep saying that, but this bug really is not possible!

The situation is quite clear, during the shutdown process of the application, an unmanaged resource’s finalizer would throw an exception because it wasn’t properly disposed.

The problem? It is most assuredly should be disposed. When debugging through the problem, I found out something extremely strange and worrying. The managed object’s finalizer was running while there were strong references to the object.

You can see that in the attached screenshot (click the image see it in full size).

That is, by the way, when you have the root in a static field, so it cannot be that the whole graph is free.

This is somehow related to threading, because debugging this would often change the way this works, but running without a debugger consistently fails.

The CLR semantics for finalizers clearly state that they can only be run after there are no more strong references to the instance. Cleary, this is not what is going on here.

The only thing that I can think of that can affect this is that there is something really strange going on with app domain unloads.

Now, I can’t figure if this is me being extremely stupid or if this is a real problem. I did manage to create a reproduction of the issue, however, which you can download here.

This is on VMWare Fusion, running Windows 2008 x64, with .Net 3.5 SP1.

To reproduce, start the application in WebDev.WebServer, wait for the page to load, and the close the WebDev.WebServer. If it crashes, you have successfully reproduced the problem.

Update – This is really interesting. both stack traces are operating on the same object, by the way.

image

Adding locking around the finalizer and dispose seems to have made the problem go away.