Deadlocking with the TPL, how to


As I mentioned, I ran into a very nasty issue with the TPL. I am not sure if it is me doing things wrong, or an actual issue.

Let us look at the code, shall we?

We start with some very simple code:

    public class AsyncEvent
    {
        private volatile TaskCompletionSource<object> tcs = new TaskCompletionSource<object>();

        public Task WaitAsync()
        {
            return tcs.Task;
        }

        public void PulseAll()
        {
            // Swap in a fresh source first, so new waiters wait for the next pulse,
            // then release everyone who is waiting on the old one.
            var taskCompletionSource = tcs;
            tcs = new TaskCompletionSource<object>();
            taskCompletionSource.SetResult(null);
        }
    }

This is effectively an event that resets automatically: all the waiters are released when PulseAll is called, and anyone who waits after that waits for the next pulse. Then we have this runner, which just executes work:

    public class Runner : IDisposable
    {
        // Pending work items; each one is completed by the background loop.
        private readonly ConcurrentQueue<TaskCompletionSource<object>> items =
            new ConcurrentQueue<TaskCompletionSource<object>>();
        private readonly Task<Task> _bg;
        private readonly AsyncEvent _event = new AsyncEvent();
        private volatile bool _done;

        public Runner()
        {
            _bg = Task.Factory.StartNew(() => Background());
        }

        private async Task Background()
        {
            while (_done == false)
            {
                TaskCompletionSource<object> result;
                if (items.TryDequeue(out result) == false)
                {
                    // Nothing to do: wait until AddWork or Dispose pulses the event.
                    await _event.WaitAsync();
                    continue;
                }

                //work here, note that we do NOT use await!

                result.SetResult(null);
            }
        }

        public Task AddWork()
        {
            var tcs = new TaskCompletionSource<object>();
            items.Enqueue(tcs);

            // Wake the background loop so it picks up the new item.
            _event.PulseAll();

            return tcs.Task;
        }

        public void Dispose()
        {
            // Ask the loop to stop, wake it up, then block on the background task.
            _done = true;
            _event.PulseAll();
            _bg.Wait();
        }
    }
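One detail of the runner that is easy to miss: Task.Factory.StartNew with an async method gives back a Task<Task>, and the outer task finishes as soon as the method returns its own task, which happens at the first await that has to wait. The following is a minimal sketch of that behavior in isolation. StartNewDemo and its members are hypothetical names, not part of the code above.

```csharp
using System;
using System.Threading.Tasks;

public static class StartNewDemo
{
    // Hypothetical stand-in for an async background loop: it just waits on a gate.
    private static async Task WaitForGate(Task gate)
    {
        await gate;
    }

    // Returns true when the outer task has completed while the inner one is still pending.
    public static bool OuterCompletesBeforeInner()
    {
        var gate = new TaskCompletionSource<object>();

        // Same shape as the Runner constructor: the lambda returns a Task,
        // so StartNew hands back a Task<Task>.
        Task<Task> outer = Task.Factory.StartNew(() => WaitForGate(gate.Task));

        // Waits only until WaitForGate has returned its task, not until it has finished.
        outer.Wait();

        Task inner = outer.Result;
        bool innerStillPending = !inner.IsCompleted;

        // Let the inner task finish so nothing is left hanging.
        gate.SetResult(null);
        inner.Wait();

        return innerStillPending;
    }
}
```

To get a task that represents the whole async method, outer.Unwrap() or Task.Run can be used instead.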

And finally, the code that causes the problem:

    public static async Task Run()
    {
        using (var runner = new Runner())
        {
            await runner.AddWork();
        }
    }

So far, it is all pretty innocent, I think you would agree. But this code hangs with a deadlock. Here is why:


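Because tasks can share threads, the continuation of await runner.AddWork() can run inline on the thread that called result.SetResult(null), which is the thread running the Background task. That continuation leaves the using block, so Dispose runs on the Background task thread and calls _bg.Wait(). We are waiting for the background task to complete on the very thread it needs in order to complete.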

Result, deadlock.
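The thread sharing can be reproduced in isolation. The sketch below is not from the original code, and InlineContinuationDemo and its members are hypothetical names. It assumes a console-style environment with no SynchronizationContext, and checks whether the code after an await runs on the thread that called SetResult.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InlineContinuationDemo
{
    // Awaits the task, then reports which thread the continuation ran on.
    private static async Task<int> WaitAndReportThread(Task task)
    {
        await task.ConfigureAwait(false);
        return Thread.CurrentThread.ManagedThreadId;
    }

    // Returns true when the code after the await ran on the thread that called SetResult.
    public static bool ContinuationRunsOnSetterThread()
    {
        var tcs = new TaskCompletionSource<object>();

        // Runs synchronously up to the await and registers the continuation.
        Task<int> waiter = WaitAndReportThread(tcs.Task);

        int setterThreadId = -1;
        var setter = new Thread(() =>
        {
            setterThreadId = Thread.CurrentThread.ManagedThreadId;
            // With a default TaskCompletionSource, the awaiting code is
            // expected to resume inside this call, on this thread.
            tcs.SetResult(null);
        });
        setter.Start();
        setter.Join();

        return waiter.Result == setterThreadId;
    }
}
```

In the runner, the thread calling SetResult is the one running Background, and the code that resumes inside that call is the rest of Run, including Dispose.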

If I add:

    await Task.Yield();

then the deadlock goes away, because that forces this method to be completed on another thread. But that looks more like something that you add after you discover the bug, to be honest.
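Another option, offered here as an assumption and not as something the code above uses, is to create the TaskCompletionSource with TaskCreationOptions.RunContinuationsAsynchronously (available from .NET Framework 4.6 on). With that flag, SetResult should queue the awaiting code instead of running it on the calling thread. A minimal sketch, with AsyncContinuationDemo and its members as hypothetical names:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncContinuationDemo
{
    // Awaits the task, then reports whether the continuation ran on a thread pool thread.
    private static async Task<bool> WaitAndReportPoolThread(Task task)
    {
        await task.ConfigureAwait(false);
        return Thread.CurrentThread.IsThreadPoolThread;
    }

    // Returns true when the code after the await ran on the thread pool,
    // and not on the dedicated thread that called SetResult.
    public static bool ContinuationRunsOnThreadPool()
    {
        var tcs = new TaskCompletionSource<object>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // Runs synchronously up to the await and registers the continuation.
        Task<bool> waiter = WaitAndReportPoolThread(tcs.Task);

        // A dedicated (non thread pool) thread completes the task.
        var setter = new Thread(() => tcs.SetResult(null));
        setter.Start();
        setter.Join();

        return waiter.Result;
    }
}
```

Applied to the TaskCompletionSource created in AddWork, this would keep the rest of Run (and so Dispose) off the Background task thread without a Task.Yield at the call site, at the cost of an extra hop through the thread pool for every work item.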