Fixing the heisenbug
I just ran the Rhino Service Bus test suite, and the suite hung. This is where it most often hangs:
1: [Fact]
2: public void A_message_that_fails_processing_should_go_back_to_queue_on_transactional_queue()
3: {
4: TransactionalTransport.MessageArrived += ThrowOnFirstAction();
5:
6: TransactionalTransport.Send(TransactionalTestQueueUri, DateTime.Today);
7:
8: gotFirstMessage.WaitOne();
9:
10: Assert.NotNull(transactionalQueue.Peek());
11:
12: gotSecondMessage.Set();
13: }
And it kept hanging on line 8. Now, it worked on other machines, and when I ran this test on its own, it worked just fine. I knew that the issue was probably a matter of test interaction, but how could I debug this?
When running under the debugger, the hang didn’t reproduce.
It was pretty consistent in where it failed, however, and that gave me an opening. Many people are not familiar with the ability to interact with the debugger from your code, but the .NET framework contains the System.Diagnostics.Debugger class. And that showed me the path.
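For readers who have not used this class, here is a minimal sketch of asking for a debugger from inside the code. Debugger.IsAttached and Debugger.Launch() are real members of System.Diagnostics.Debugger; the DebugHook class and the LAUNCH_DEBUGGER environment variable are hypothetical names made up for illustration, so that the prompt only appears when explicitly requested.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of Rhino Service Bus.
public static class DebugHook
{
    // Assumed opt-in switch; the variable name is invented for this sketch.
    public const string EnvVar = "LAUNCH_DEBUGGER";

    // Pure decision logic: launch only when requested and no debugger is attached yet.
    public static bool ShouldLaunch(string flag, bool alreadyAttached)
    {
        return flag == "1" && !alreadyAttached;
    }

    // Call this as the first line of the test that hangs.
    public static void LaunchIfRequested()
    {
        string flag = Environment.GetEnvironmentVariable(EnvVar);
        if (ShouldLaunch(flag, Debugger.IsAttached))
        {
            // Asks the system to attach a debugger to this process.
            Debugger.Launch();
        }
    }
}
```

Calling Debugger.Launch() unconditionally also works for a one-off investigation; the guard just avoids a prompt on every run if the call is left in by accident.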
I put Debugger.Launch() as the first line of the test, and ran the tests without the debugger. When this particular test was executed, I broke into the debugger, and I was able to check what the current state of the system was. As it turned out, I had another bus instance reading from the queue. That was because a few other tests weren’t disposing of their buses properly. I fixed that and the test suite ran normally.
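One way to make that kind of cleanup hard to forget is to have the test class own everything it creates and release it in Dispose; xUnit creates a new instance of the test class for each test and calls Dispose on it when the class implements IDisposable. The sketch below is an assumption about how this could look, not the actual fix: FakeBus and BusTestFixture are hypothetical stand-ins, not Rhino Service Bus types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a bus that keeps reading from a queue until disposed.
public sealed class FakeBus : IDisposable
{
    // Counts buses that are still listening; shared across tests on purpose,
    // to mimic the leaked listener described above.
    public static int ActiveListeners;

    private bool disposed;

    public FakeBus()
    {
        ActiveListeners++;
    }

    public void Dispose()
    {
        if (disposed) return; // safe to call more than once
        disposed = true;
        ActiveListeners--;
    }
}

// Hypothetical base class: tests register what they create,
// and everything is released when the test instance is disposed.
public class BusTestFixture : IDisposable
{
    private readonly List<IDisposable> tracked = new List<IDisposable>();

    public T Track<T>(T item) where T : IDisposable
    {
        tracked.Add(item);
        return item;
    }

    public void Dispose()
    {
        foreach (IDisposable item in tracked)
        {
            item.Dispose();
        }
        tracked.Clear();
    }
}
```

A test would then write something like `var bus = Track(new FakeBus());` and never dispose the bus by hand.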
Comments
This is hilarious. I'm just starting to look at Rhino ServiceBus and the first thing I did was run the full unit test. Guess where it hung?
Thanks for the post and the code!
The bug is the global state.
Sorry to be asking this question on this post, but the comments on the original are closed. I have downloaded the rhino-tools trunk and am looking into the DSL. The Scheduling DSL test cases have been marked ignore. If I uncomment them, the Boo file does not compile. Can you please let me know what the problem could be?
job,
"then" is now a keyword in the Boo language.
And you added a test to test that your tests were properly disposing their bus after testing? :)