[Unstable code] Why timeouts don't mean squat...

Because they aren't helpful for the pathological cases. Let us take this simple example:

[ServiceContract]
public interface IFoo
{
	[OperationContract]
	string GetMessage();
}

var stopwatch = Stopwatch.StartNew();
var channel = ChannelFactory<IFoo>.CreateChannel(
	new BasicHttpBinding
	{
		SendTimeout = TimeSpan.FromSeconds(1), 
		ReceiveTimeout = TimeSpan.FromSeconds(1),
		OpenTimeout = TimeSpan.FromSeconds(1),
		CloseTimeout = TimeSpan.FromSeconds(1)
	},
	new EndpointAddress("http://localhost:6547/bar"));

var message = channel.GetMessage();

stopwatch.Stop();
Console.WriteLine("Got message in {0}ms", stopwatch.ElapsedMilliseconds);

On the face of it, we look safe as far as timeouts go, right? We set every timeout setting there is. At most, we will spend a second waiting for the message, and get a timeout exception if that fails.

Here is a simple way to make this code hang for a minute (more after the code):

namespace ConsoleApplication1
{
	using System;
	using System.Linq;
	using System.Diagnostics;
	using System.IO;
	using System.Net;
	using System.ServiceModel;
	using System.Threading;

	class Program
	{
		static void Main(string[] args)
		{
			var host = new ServiceHost(typeof(FooImpl), 
				new Uri("http://localhost/foo"));
			host.AddServiceEndpoint(typeof(IFoo), 
				new BasicHttpBinding(), 
				new Uri("http://localhost/foo"));
			host.Open();

			new SlowFirewall();

			var stopwatch = Stopwatch.StartNew();
			var channel = ChannelFactory<IFoo>.CreateChannel(
				new BasicHttpBinding
				{
					SendTimeout = TimeSpan.FromSeconds(1), 
					ReceiveTimeout = TimeSpan.FromSeconds(1),
					OpenTimeout = TimeSpan.FromSeconds(1),
					CloseTimeout = TimeSpan.FromSeconds(1)
				},
				new EndpointAddress("http://localhost:6547/bar"));
			
			var message = channel.GetMessage();
			
			stopwatch.Stop();
			Console.WriteLine("Got message in {0}ms", stopwatch.ElapsedMilliseconds);


			host.Close();
		}
	}

	[ServiceContract]
	public interface IFoo
	{
		[OperationContract]
		string GetMessage();
	}

	public class FooImpl : IFoo
	{
		public string GetMessage()
		{
			return new string('*', 5000);
		}
	}

	public class SlowFirewall
	{
		private readonly HttpListener listener;

		public SlowFirewall()
		{
			listener = new HttpListener();
			listener.Prefixes.Add("http://localhost:6547/bar/");
			listener.Start();
			listener.BeginGetContext(OnGetContext, null);
		}

		private void OnGetContext(IAsyncResult ar)
		{
			var context = listener.EndGetContext(ar);
			var request = WebRequest.Create("http://localhost/foo");
			request.Method = context.Request.HttpMethod;
			request.ContentType = context.Request.ContentType;
			// Headers that cannot be copied through the Headers collection.
			var specialHeaders = new[] { "Connection", "Content-Length", "Host", "Content-Type", "Expect" };
			foreach (string header in context.Request.Headers)
			{
				if (specialHeaders.Contains(header))
					continue;
				request.Headers[header] = context.Request.Headers[header];
			}

			// Forward the request body to the real service.
			var buffer = new byte[context.Request.ContentLength64];
			ReadAll(buffer, context.Request.InputStream);
			using (var stream = request.GetRequestStream())
			{
				stream.Write(buffer, 0, buffer.Length);
			}

			using (var response = request.GetResponse())
			using (var responseStream = response.GetResponseStream())
			{
				buffer = new byte[response.ContentLength];
				ReadAll(buffer, responseStream);
				foreach (string header in response.Headers)
				{
					if (specialHeaders.Contains(header))
						continue;
					context.Response.Headers[header] = response.Headers[header];
				}
				context.Response.ContentType = response.ContentType;

				// Trickle the response back one byte at a time, pausing 10ms after each byte.
				int i = 0;
				foreach (var b in buffer)
				{
					context.Response.OutputStream.WriteByte(b);
					context.Response.OutputStream.Flush();
					Thread.Sleep(10);
					Console.WriteLine(i++);
				}
				context.Response.Close();
			}
		}

		// Reads from the stream until the buffer is full.
		private void ReadAll(byte[] buffer, Stream stream)
		{
			int current = 0;
			while (current < buffer.Length)
			{
				int read = stream.Read(buffer, current, buffer.Length - current);
				if (read == 0)
					throw new EndOfStreamException(); // avoid spinning forever on a short stream
				current += read;
			}
		}
	}
}

This means that even supposedly safe code, which has taken care to specify its timeouts properly, is not safe from blocking because of network issues, which is exactly the thing we specified the timeouts to avoid. I should note that this sample code is still at a very high level. There are a lot of things that you can do at all levels of the network stack to play havoc with your code.
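To show how a similar thing can happen one level down, here is a minimal sketch at the raw socket level. It is not part of the original sample: the TrickleDemo class, the 30 byte payload and the 200ms delay are all made up for illustration, and I am not claiming this is the exact mechanism WCF runs into above. What it does show is that a one second ReceiveTimeout on a TcpClient applies to each individual read, so a peer that keeps trickling bytes can hold the reader for far longer than a second without a single timeout being raised.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class TrickleDemo
{
	static void Main()
	{
		// Hypothetical slow peer: sends 30 bytes, one every 200ms.
		var listener = new TcpListener(IPAddress.Loopback, 0);
		listener.Start();
		int port = ((IPEndPoint)listener.LocalEndpoint).Port;

		var server = new Thread(() =>
		{
			using (var peer = listener.AcceptTcpClient())
			{
				peer.NoDelay = true; // send each byte immediately
				var stream = peer.GetStream();
				for (int i = 0; i < 30; i++)
				{
					stream.WriteByte((byte)'*');
					Thread.Sleep(200);
				}
			}
		});
		server.IsBackground = true;
		server.Start();

		var stopwatch = Stopwatch.StartNew();
		using (var client = new TcpClient())
		{
			client.Connect(IPAddress.Loopback, port);
			// One second timeout, but it covers each read call, not the whole exchange.
			client.ReceiveTimeout = 1000;

			var stream = client.GetStream();
			var buffer = new byte[30];
			int total = 0;
			while (total < buffer.Length)
			{
				int read = stream.Read(buffer, total, buffer.Length - total);
				if (read == 0)
					break; // peer closed the connection
				total += read;
			}
			stopwatch.Stop();
			Console.WriteLine("Read {0} bytes in {1}ms, no timeout raised",
				total, stopwatch.ElapsedMilliseconds);
		}
		listener.Stop();
	}
}
```

The sending side sleeps for about six seconds in total, so the elapsed time should come out at several times the configured one second timeout. Every read returns within 200ms, so the timeout never gets a chance to fire.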

As an aside, what book am I re-reading?