How to reproduce an occasionally failing test?
One of the worst possible things that can happen is a test that fails, sometimes. Not always, and never under the debugger, but it fails.
It tells you that there is a bug, but doesn’t give you the tools to find it.
Usually, this is an indication of a problem with the code that is exposed through multithreading. I found the following approach pretty useful in digging those bastards out:
static void Main()
{
    for (int i = 0; i < 100; i++)
    {
        using (var x = new Raven.Tests.Indexes.QueryingOnDefaultIndex())
        {
            x.CanPageOverDefaultIndex();
            Console.Write(".");
        }
    }
}
Yes, it is trivial, but you would be surprised how often I see people throwing up their hands in despair over issues like this.
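A minimal sketch of the same loop as a reusable helper, assuming you also want to know which iteration failed and with what exception. FlakyTestRunner and FirstFailure are hypothetical names for illustration, not part of Raven or any test framework:

```csharp
using System;

public static class FlakyTestRunner
{
    // Runs `test` up to `iterations` times.
    // Returns the zero-based index of the first run that threw, or -1 if every run passed.
    // The exception from the failing run (or null) is handed back through `failure`.
    public static int FirstFailure(Action test, int iterations, out Exception failure)
    {
        for (int i = 0; i < iterations; i++)
        {
            try
            {
                test();
            }
            catch (Exception e)
            {
                failure = e;
                return i;
            }
        }

        failure = null;
        return -1;
    }
}
```

Passing the body of the loop above as the test delegate and checking for a non-negative result gives you one place to log the exception, or to set a breakpoint, once the failure finally shows up.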
Comments
I can offer a funny story about occasionally failing tests. One of the tests in my team a couple of years ago was failing like once a day. Really random. It failed either on a developer workstation or on the build server.
One day I got pissed at it and opened the code... I discovered an elaborate scheme to select a random table cell to use for testing. Of course it had a bug that manifested only when a specific cell was chosen by the randomizer. Our very own Russian roulette :-)
Be afraid of people that use random numbers in unit tests. Be very afraid!
Indeed, using random numbers in unit tests is bad; use arbitrary numbers instead.
@Ayende, you can probably debug the tests by using your Main() function and setting the debugger to stop on exceptions; or would even having a debugger connected stop the test from working in this case?
'Usually, this is an indication of a problem with the code that is exposed through multithreading'
Another reason can be using DateTime.Now and its ticks modulo something in legacy code ;-)
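One way to take DateTime.Now out of the equation, sketched under the assumption that the code in question can accept its clock as a delegate. TickBucketer is a made-up stand-in for that kind of legacy code, not something from the post:

```csharp
using System;

// Hypothetical example of time-dependent logic with the clock injected.
public sealed class TickBucketer
{
    private readonly Func<DateTime> _now;

    // Production code would pass () => DateTime.Now; tests pass a fixed time.
    public TickBucketer(Func<DateTime> now)
    {
        _now = now;
    }

    // Picks a bucket from the current tick count, the kind of logic that
    // behaves differently from run to run when it reads the real clock.
    public long Bucket(int buckets)
    {
        return _now().Ticks % buckets;
    }
}
```

With a fixed clock the result is the same on every run, so a failure tied to a particular tick value can be reproduced on demand instead of once a day.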
Google Microsoft CHESS; I haven't had time to check it out myself yet, but it could help with these issues. I pasted the URL in a previous comment but it got flagged as spam.
The problem is that even with a loop you might not hit the issue; there are so many variables to consider, like CPU architecture, system load, etc., when it comes to multithreading bugs.
Love it!
I debug this way; I'm not a debugger expert, so I use Console.WriteLine() frequently, and trace through the code in my head. I have found that "occasionally failing" tests are almost always related to threading code.
I have learned to use an abstraction for launching threads and utilize a synchronized version of the thread starter for unit testing.
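A rough sketch of what such an abstraction might look like. The interface and class names here are assumptions for illustration; the comment above does not say how the actual one was built:

```csharp
using System;
using System.Threading;

// Hypothetical abstraction: code under test asks this to run work "in the background".
public interface IThreadStarter
{
    void Start(Action work);
}

// Production version: runs the work on a new background thread.
public sealed class BackgroundThreadStarter : IThreadStarter
{
    public void Start(Action work)
    {
        var thread = new Thread(() => work()) { IsBackground = true };
        thread.Start();
    }
}

// Unit test version: runs the work inline on the calling thread,
// so the test is deterministic and exceptions surface in the test itself.
public sealed class SynchronousThreadStarter : IThreadStarter
{
    public void Start(Action work)
    {
        work();
    }
}
```

The trade-off is that the synchronous version hides real races, so it keeps the logic tests stable but does not replace a looped or threaded run for finding concurrency bugs.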
Albert Einstein once said "The definition of insanity is doing the same thing over and over again and expecting different results". Who's insane?
I believe that suspending and resuming threads at random (done by a dedicated controller thread) can help diagnose those issues. It is a piece of write-once infrastructure.
I once faced a problem where an unknown test failed once in a blue moon with a -100 error (basically: Exception occurred on secondary thread, causing NUnit to exit abnormally without any information).
The problem was that the tests were soooooooooo slow. I think when I rolled into town, the 10k 'unit' tests would take well over 10 minutes to run, but the -100 popped up once in a blue moon. As a result, it'd take literally days to make the failure occur and, when you did, there'd be no chance of knowing which assembly, never mind test, was causing it. I was so frustrated by it that I ended up writing a script to perform a binary search. It'd omit half of the test assemblies, run the other half in a loop for hours, analyse the exit codes and dump out the assemblies present when the failure occurred. It'd then take the subset of failing assemblies and repeat the process.
I set it running on a spare machine over the weekend and came back to find the assembly had been identified. After that it was just a case of running half of the tests in the assembly in a loop and repeating the process. It took days, but we eventually found the culprit.
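The narrowing step of that binary search can be sketched roughly as follows. This is not the original script: FailureBisector is a hypothetical name, and the failsWhenLooped callback stands in for "run this subset in a loop for hours and check the exit codes". It assumes a single culprit, and that a subset containing it will fail within the time you give it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FailureBisector
{
    // Halves the candidate list until a single item remains.
    // `failsWhenLooped` must return true if running the given subset
    // repeatedly produced the failure.
    public static T FindCulprit<T>(
        IReadOnlyList<T> candidates,
        Func<IReadOnlyList<T>, bool> failsWhenLooped)
    {
        if (candidates.Count == 0)
            throw new ArgumentException("No candidates to search.", nameof(candidates));

        IReadOnlyList<T> remaining = candidates;
        while (remaining.Count > 1)
        {
            var firstHalf = remaining.Take(remaining.Count / 2).ToList();
            var secondHalf = remaining.Skip(remaining.Count / 2).ToList();

            // If the first half did not fail, assume the culprit is in the second half.
            remaining = failsWhenLooped(firstHalf) ? firstHalf : secondHalf;
        }

        return remaining[0];
    }
}
```

The same routine works at both levels described above: first over test assemblies, then over the tests inside the assembly it points to. The weak spot is the "did not fail" branch, since an intermittent failure that happens not to fire sends the search into the wrong half, which is why each subset has to be looped for a long time.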
This is why paying attention to broken windows is a 'good thing' for your mental health... I just about killed someone :)
[Repeat(10000)] and [ThreadedRepeat(100)] are generally very effective.
@Ayende, "Usually, this is an indication of a problem with the code that is exposed through multithreading."
What do you do when the exception is in the mock library and not your code? I am passing a Moq service mock into multiple threads. It intermittently blows up because it seems that the Setup (Expect) delegates use a non-synchronized List (you get an Index exception on the List).
Now, I know my object under test works because I never get an exception from there. What should I do, stop doing the multi-threaded tests (single threaded tests all work fine), create a custom thread-safe mock for myself, or...?
Rob,
Use a mocking library that is thread safe (Rhino Mocks)?
I was hoping you would say that. I have always used Rhino in the past but on this client site they are using Unity+Moq.
Hmmmm.... It may be time to change back before we have too many unit tests written, or just use Rhino for my multi-threaded ones.
Thanks