Fun with the ThreadPool #2: Find the bug


This bug caused a production system to grind to a screeching halt under just the right amount of load. It took a while to figure out what was happening. I created a small test that shows the problem.

Here is a little description:

  • The timer wakes a set of services that need to process input.
  • To avoid re-entrancy issues, I used synchronized methods.
  • Some services do a non-trivial amount of work.

using System;
using System.Runtime.CompilerServices;
using System.Threading;

class Program
{
      static int count = 0;
      static Timer timer;

      static void Main(string[] args)
      {
            // Every 100 ms, queue one lengthy job and one status report.
            timer = new Timer(delegate
            {
                  int temp = Interlocked.Increment(ref count);
                  ThreadPool.QueueUserWorkItem(DoLengthyWork, temp);
                  ThreadPool.QueueUserWorkItem(ReportStatus, temp);
            }, null, 100, 100);

            Console.ReadKey();
      }

      // Synchronized: only one thread at a time can be inside this method.
      [MethodImpl(MethodImplOptions.Synchronized)]
      public static void DoLengthyWork(object counter)
      {
            //do stuff that may fail on re-entry
            Thread.Sleep(1500);
            Console.WriteLine("Finished lengthy work: {0}", counter);
      }

      public static void ReportStatus(object counter)
      {
            Console.WriteLine("Reporting status: {0}", counter);
      }
}

For fun, this takes about a minute to fail...
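
If you would rather watch the system degrade than wait for it to stop, one option is to log how many thread pool workers are busy while the test runs. This is only a minimal sketch: PoolMonitor, BusyWorkerThreads and StartReporting are hypothetical names and not part of the original test. It relies only on ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads.

```csharp
using System;
using System.Threading;

// Hypothetical helper for observing the test; not part of the original code.
static class PoolMonitor
{
    // Number of pool worker threads currently in use (max minus available).
    public static int BusyWorkerThreads()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        return maxWorkers - freeWorkers;
    }

    // Prints the busy count every intervalMs milliseconds from a dedicated
    // background thread, so the reporting does not go through the pool.
    public static Thread StartReporting(int intervalMs)
    {
        Thread reporter = new Thread(delegate()
        {
            while (true)
            {
                Console.WriteLine("Busy pool workers: {0}", BusyWorkerThreads());
                Thread.Sleep(intervalMs);
            }
        });
        reporter.IsBackground = true; // do not keep the process alive
        reporter.Start();
        return reporter;
    }
}
```

To try it, call PoolMonitor.StartReporting(1000) in Main just before Console.ReadKey() and compare the numbers it prints with the "Reporting status" lines.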

What is the problem, and what is causing it?