Testing interleaved async file writing

Here is another interesting aspect that I ran into when thinking about writing to files. What will happen if I start two async write operations? Will they get interleaved? Will I get errors when using this?

Here is the test using a single stream:

var path = "test.txt";

if (File.Exists(path))
	File.Delete(path);

var handles = new List<WaitHandle>();
using (
	var stream = new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None, 0x1000,
								 FileOptions.Asynchronous))
{
	for (int i = 0; i < 64; i++)
	{
		var handle = new ManualResetEvent(false);
		var bytes = Encoding.UTF8.GetBytes( i + Environment.NewLine);
		stream.BeginWrite(bytes, 0, bytes.Length, delegate(IAsyncResult ar)
		{
			stream.EndWrite(ar);
			handle.Set();
		}, stream);
		handles.Add(handle);
	}
	WaitHandle.WaitAll(handles.ToArray());
	stream.Flush();

}

This worked perfectly. The writes land in the file sequentially, with no interleaving.

The situation changes somewhat when we work with several streams instead. Here is the code:

var path = "test.txt";

if (File.Exists(path))
	File.Delete(path);

var handles = new List<WaitHandle>();

for (int i = 0; i < 64; i++)
{
	var stream = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite, 0x1000,
	                            FileOptions.Asynchronous);
	var handle = new ManualResetEvent(false);
	var bytes = Encoding.UTF8.GetBytes(i + Environment.NewLine);
	stream.BeginWrite(bytes, 0, bytes.Length, delegate(IAsyncResult ar)
	{
		stream.EndWrite(ar);
		stream.Flush();
		stream.Dispose();
		handle.Set();
	}, stream);
	handles.Add(handle);
}
WaitHandle.WaitAll(handles.ToArray());

The content of the file after running this code on my machine is: 63

I am pretty sure that it is possible to get other results as well. But considering the way IO operations work in general ( Write( position, buffer) ), that makes a lot of sense: every stream starts at position 0, so each write targets the same offset and overwrites the ones before it.
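To make the position-based behavior concrete, here is a minimal sketch that uses two independent streams over the same file. It uses synchronous writes so the result is deterministic, which is a simplification of the test above. The class name, method name, and file name are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class SharedOffsetDemo
{
    // Writes through two independent streams over the same file
    // and returns the final file content.
    public static string Run(string path)
    {
        if (File.Exists(path))
            File.Delete(path);

        // Both streams allow shared read/write access, as in the multi-stream test.
        using (var first = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite))
        using (var second = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite))
        {
            var a = Encoding.UTF8.GetBytes("aaaa");
            var b = Encoding.UTF8.GetBytes("bb");

            // Each stream tracks its own Position, and both start at 0.
            first.Write(a, 0, a.Length);
            first.Flush();

            // This write also lands at offset 0 and overwrites the first two bytes.
            second.Write(b, 0, b.Length);
            second.Flush();
        }

        return File.ReadAllText(path);
    }

    public static void Main()
    {
        Console.WriteLine(Run("shared-offset-test.txt")); // prints "bbaa"
    }
}
```

The second stream does not append after the first one. It writes at its own offset 0, which is the same effect that leaves only the last value in the file in the test above.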

Interestingly, in regards to the single stream version, the documentation states:

Multiple simultaneous asynchronous requests render the request completion order uncertain.

It doesn't say anything about the actual write order, however. The completion order is not an issue as far as I am concerned, but the order of the data on disk is very important.

Another interesting question was how a single file stream would work with multiple threads calling BeginWrite in parallel. Here is the code:

var path = "test.txt";

if (File.Exists(path))
	File.Delete(path);

var handles = new List<WaitHandle>();
using (
	var stream = new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None, 0x1000,
								 FileOptions.Asynchronous))
{
	for (int i = 0; i < 64; i++)
	{
		var handle = new ManualResetEvent(false);
		var bytes = Encoding.UTF8.GetBytes(i + Environment.NewLine);
		ThreadPool.QueueUserWorkItem(delegate
		{
			Thread.Sleep(10);
			stream.BeginWrite(bytes, 0, bytes.Length, delegate(IAsyncResult ar)
			{
				stream.EndWrite(ar);
				handle.Set();
			}, stream);
		});
		handles.Add(handle);
	}
	WaitHandle.WaitAll(handles.ToArray());
	stream.Flush();

}

The result of this code looks like this:

[image: contents of the resulting file]

Obviously, we have a race condition on the stream's known position. Adding a lock around the stream.BeginWrite call results in this:

[image: contents of the resulting file with the lock in place]

I am actually fine with this, because as far as I can tell, this might very well be the actual execution order.
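The locked variant is not shown in the post, so the following is only a sketch of what it might look like. The `gate` object and the use of CountdownEvent in place of the list of wait handles are assumptions made here for brevity, not the original code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

public static class LockedBeginWriteDemo
{
    // Issues `count` BeginWrite calls from thread pool threads, serialized
    // by a lock, and returns the lines that ended up in the file.
    public static string[] Run(string path, int count)
    {
        if (File.Exists(path))
            File.Delete(path);

        var gate = new object(); // hypothetical lock object guarding BeginWrite
        using (var done = new CountdownEvent(count))
        using (var stream = new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None, 0x1000,
                                           FileOptions.Asynchronous))
        {
            for (int i = 0; i < count; i++)
            {
                var bytes = Encoding.UTF8.GetBytes(i + Environment.NewLine);
                ThreadPool.QueueUserWorkItem(delegate
                {
                    Thread.Sleep(10);
                    // Only one thread at a time may start a write, so each
                    // BeginWrite sees a consistent stream position.
                    lock (gate)
                    {
                        stream.BeginWrite(bytes, 0, bytes.Length, delegate(IAsyncResult ar)
                        {
                            stream.EndWrite(ar);
                            done.Signal();
                        }, null);
                    }
                });
            }
            done.Wait();    // wait for every write to complete
            stream.Flush();
        }
        return File.ReadAllLines(path);
    }

    public static void Main()
    {
        foreach (var line in Run("locked-test.txt", 64))
            Console.WriteLine(line);
    }
}
```

With the lock in place, every number should show up exactly once, but the order still depends on which thread pool thread reaches the lock first.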

Another very important observation is that the Position property of the stream is updated immediately when you call BeginWrite().
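One way to check this on your own machine is to read Position right after BeginWrite returns and before EndWrite is called. This is a small sketch with made-up class and file names. It prints the values rather than assuming them, since the middle value may depend on the runtime's FileStream implementation.

```csharp
using System;
using System.IO;
using System.Text;

public static class PositionDemo
{
    // Returns the stream position before BeginWrite, right after BeginWrite,
    // and after EndWrite, in that order.
    public static long[] Run(string path)
    {
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None, 0x1000,
                                           FileOptions.Asynchronous))
        {
            var bytes = Encoding.UTF8.GetBytes("hello");

            var before = stream.Position;
            var ar = stream.BeginWrite(bytes, 0, bytes.Length, null, null);
            var afterBegin = stream.Position; // read before the write is completed via EndWrite
            stream.EndWrite(ar);
            var afterEnd = stream.Position;

            return new[] { before, afterBegin, afterEnd };
        }
    }

    public static void Main()
    {
        var positions = Run("position-test.txt");
        Console.WriteLine("before BeginWrite: " + positions[0]);
        Console.WriteLine("after BeginWrite:  " + positions[1]);
        Console.WriteLine("after EndWrite:    " + positions[2]);
    }
}
```

If the observation above holds on your runtime, the second value already includes the five bytes of the pending write.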

That was a somewhat random walk over the BeginWrite implementation, I admit, but I hope it will make sense soon.