ConcurrentDictionary.GetOrAdd may call the valueFactory method more than once
Originally posted at 3/28/2011
When you assume, you are making an ass out of yourself, you stupid moronic idiot with no sense whatsoever. The ass in question, if anyone cares to think about it, is yours truly.
Let us take a look at the following code snippet:
var concurentDictionary = new ConcurrentDictionary<int, int>();
var w = new ManualResetEvent(false);
int timedCalled = 0;
var threads = new List<Thread>();
for (int i = 0; i < Environment.ProcessorCount; i++)
{
    threads.Add(new Thread(() =>
    {
        w.WaitOne();
        concurentDictionary.GetOrAdd(1, i1 =>
        {
            Interlocked.Increment(ref timedCalled);
            return 1;
        });
    }));
    threads.Last().Start();
}
w.Set(); // release all threads to start at the same time
foreach (var thread in threads)
{
    thread.Join();
}
Console.WriteLine(timedCalled);
What would you say would be the output of this code?
Well, I assumed that it would behave in an atomic fashion, and that the implementation is something like:
if (TryGetValue(key, out value))
    return value;
lock (this)
{
    if (TryGetValue(key, out value))
        return value;
    value = valueFactory(key);
    AddValue(key, value);
    return value;
}
Of course, the whole point of the ConcurrentDictionary is that there are no locks. Well, that is nice, except that because I assumed the call is only made once, I passed it a function that had side effects when called twice.
That was pure hell to figure out, because in my mind, of course there was no error in this function.
Comments
I always try to remember...
http://www.youtube.com/watch?v=6hrLj8QEAgI
Ayende, they're not totally lock free... But the delegate executes before the lock.
BTW, is there any reason you're using Threads instead of Tasks/TPL, or is just for demonstration?
Linkgoron,
To make sure that this actually runs on separate threads.
There is a concept of "publish once" and "execute once" in Lazy<T>, described here - msdn.microsoft.com/.../...azythreadsafetymode.aspx
If you only need the "publish once" guarantee, the framework can make more performance optimizations. Unfortunately, this does not seem to have been documented for ConcurrentDictionary.
The reason for this is that user code should not be called while holding a lock (it is deadlock-prone: who knows if your function will call across threads and deadlock with itself?). Therefore the CDict will release the lock before calling your delegate. (They are internally using normal locks, but striped by hash code.)
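The behavior described above can be sketched as an ordinary helper method. This is only an illustration of the observable "publish once" semantics, not the actual ConcurrentDictionary source; the class name GetOrAddSketch and the method GetOrAddLike are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Concurrent;

public static class GetOrAddSketch
{
    // Hypothetical helper: mimics what callers can observe from GetOrAdd.
    // It is NOT the real implementation.
    public static TValue GetOrAddLike<TKey, TValue>(
        ConcurrentDictionary<TKey, TValue> dictionary,
        TKey key,
        Func<TKey, TValue> valueFactory)
    {
        TValue existing;

        // Fast path: the key is already present, so the factory is never called.
        if (dictionary.TryGetValue(key, out existing))
            return existing;

        // The factory runs with no lock held, so several threads that all
        // missed the fast path can each execute it.
        TValue candidate = valueFactory(key);

        // Only one candidate is published. Threads that lose the race
        // discard their candidate and return the published value instead.
        while (true)
        {
            if (dictionary.TryAdd(key, candidate))
                return candidate;
            if (dictionary.TryGetValue(key, out existing))
                return existing;
            // The key was removed in between; try to publish again.
        }
    }
}
```

Every caller ends up with the same published value, but nothing in this shape stops the factory from running once per racing thread.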
I don't know what you mean by "There are no locks" - WaitOne() is a lock isn't it, or do you mean something else?
Oh, Benny Hill, awesome :)
That is a nasty one to remember.
Peter, ayende was referring to the implementation of ConcurrentDictionary
WTF?!
I think MS should remove this method from ConcurrentDictionary.
Use ConcurrentDictionary<TKey, Lazy<TValue>>
When this happens, two instances of Lazy<TValue> will be created... instead of two instances of your value type.
I meant, ConcurrentDictionary[TKey, Lazy[TValue] ]
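A minimal sketch of that suggestion, reusing the shape of the test from the post. The class name LazyGetOrAddDemo and the method CountFactoryCalls are hypothetical, and the sketch assumes the Lazy<T> is created with LazyThreadSafetyMode.ExecutionAndPublication so that its inner delegate runs at most once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class LazyGetOrAddDemo
{
    // Hypothetical helper: races threadCount threads on one key and
    // returns how many times the expensive inner factory actually ran.
    public static int CountFactoryCalls(int threadCount)
    {
        var dictionary = new ConcurrentDictionary<int, Lazy<int>>();
        var startSignal = new ManualResetEvent(false);
        int timesCalled = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                startSignal.WaitOne();

                // The outer delegate may still run on several threads, but it
                // only allocates a cheap Lazy<int> wrapper. One wrapper wins.
                Lazy<int> lazy = dictionary.GetOrAdd(1, key => new Lazy<int>(() =>
                {
                    // The side effect lives inside the Lazy, which runs it once.
                    Interlocked.Increment(ref timesCalled);
                    return 1;
                }, LazyThreadSafetyMode.ExecutionAndPublication));

                // Every thread reads Value from the same published wrapper.
                int value = lazy.Value;
            });
            threads[i].Start();
        }

        startSignal.Set(); // release all threads to start at the same time
        foreach (var thread in threads)
        {
            thread.Join();
        }
        return timesCalled;
    }
}
```

Calling LazyGetOrAddDemo.CountFactoryCalls(Environment.ProcessorCount) should return 1, because the losing Lazy<int> wrappers are thrown away without their Value ever being read.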
It would be nice if the documentation of GetOrAdd contained a remark telling you that the factory function could be called more than once. I agree that we should never assume something :), but this violates the Principle of least astonishment.
Thanks Ayende for pointing this out.