Patterns for Distributed Hash TablesRange Queries

time to read 8 min | 1406 words

Next to last, in this series on using Distributed Hash Tables (DHT)!

The previous ones are:

In the previous post, we covered how we can create an index that will allow us to get all items that exactly match a specific value. However, we have seen that trying to query by range is not possible using this approach.

This white paper (PDF) covers how we can use a Trie on top of a DHT. This is an intensely interesting subject. However, since I do not intend to regurgitate either the article or the discussion on tries, I would like to show an implementation of a trie on a DHT, instead. I strongly suggest reading the article, at least.

Note about the implementation, yes, it is not as nice as it can be, probably inefficient and don't manage a lot of the things that the white paper calls for (like splitting nodes), I am aware of that, but I choose to go with a simple implementation to demonstrate the concept. If you would like to extend the implementation to show how it should be done, I would be very interested in seeing this.

The code for the Trie implementation can be found in the scratch pad.

First, we have our (mocked) distributed cache:

public class DistributedCache
{
	private static readonly Hashtable[] distributedCache = new[]
	{
		new Hashtable(), new Hashtable(), new Hashtable(),
		new Hashtable(), new Hashtable(), new Hashtable(),
		new Hashtable(), new Hashtable(), new Hashtable(),
		new Hashtable(), new Hashtable(), new Hashtable(),
		new Hashtable(), new Hashtable(), new Hashtable(),
		new Hashtable(), new Hashtable(), new Hashtable(),
	};

	public static void PutInCache(string key, object value)
	{
		var table = distributedCache[Math.Abs(key.GetHashCode()) % distributedCache.Length];
		table[key] = value;
	}

	public static object GetFromCache(string key)
	{
		return distributedCache[Math.Abs(key.GetHashCode()) % distributedCache.Length][key];
	}
}

Next we define a Trie Node:

public class TrieNode
{
	public TrieNode()
	{
		Children = new SortedList<string,string>();
	}

	public string PartialValue { get; set; }
	public SortedList<string, string> Children { get; set; }
	public string KeyInCache { get; set; }
}

And now we need to take a look at Trie Manager itself. It is a bit complex, so we will look at each piece in isolation. We will start from the simple methods:

public static void PutInCacheAndUpdateTrie(KeyValuePair<string, string> word)
{
	DistributedCache.PutInCache(word.Key, word.Value);
	UpdateTrie(word);
}

public static string CreateNodeKey(string val)
{
	return "Node: " + val;
}

Putting an item in the cache will physically put it in the cache, and also update the trie. Each node of the trie is prefixed with "Node: " and the actual key it represent. The Update Trie method is interesting:

private static void UpdateTrie(KeyValuePair<string, string> word)
{
	string currentKey = word.Value;
	var node = DistributedCache.GetFromCache(CreateNodeKey(currentKey)) as TrieNode;
	while (node == null)
	{
		// this code assumes that the empty node is already in the cache
		currentKey = currentKey.Substring(0, currentKey.Length - 1);
		node = DistributedCache.GetFromCache(CreateNodeKey(currentKey)) as TrieNode;
	}
	if (currentKey == word.Value)// value already in trie, overwrite it
	{
		node.KeyInCache = word.Key;
		return;
	}
	// need to create child node(s)
	while (currentKey != word.Value)
	{
		currentKey = currentKey + word.Value[currentKey.Length];

		var childNode = new TrieNode
		{
			PartialValue = currentKey,
			KeyInCache = currentKey == word.Value ? word.Key : null
		};
		string value = CreateNodeKey(currentKey);
		node.Children.Add(value, value);
		DistributedCache.PutInCache(CreateNodeKey(currentKey), childNode);
		node = childNode;
	}
}

The overall structure is:

  • Find the highest node that match the current value
  • If it exists, overrwrite the value (we allow only a single key per value in this impl.)
  • If it doesn't exists, create all the nodes from the root (which always exists) to the current value.

This is basically it, all we have left is actually querying the trie:

public static IEnumerable<KeyValuePair<string, string>> GetFromCacheInRange(string start, string end)
{
	var lcd = GetLowestCommonDenominator(start, end);
	var cache = (TrieNode)DistributedCache.GetFromCache(CreateNodeKey(lcd));
	if (cache == null)
		return new KeyValuePair<string, string>[0];
	int indexOfFirstKey = cache.Children.IndexOfKey(lcd);
	if (indexOfFirstKey == -1)
		indexOfFirstKey = 0;
	var keys = new List<string>();
	if (IsMatchingNode(cache, start, end))
		keys.Add(cache.KeyInCache);
	for (int i = indexOfFirstKey; i < cache.Children.Count; i++)
	{
		if (ProcessNode(keys, cache.Children.Values[i], start, end) == false)
			break;
	}
	var result = new Dictionary<string, string>();
	foreach (var s in keys)
	{
		result.Add(s, DistributedCache.GetFromCache(s).ToString());
	}
	return result;
}

Here we find the maximum shared string in the start of both start and end, then try to get the relevant node for it. Then we iterate over all the direct children of the node and add the items to the list of keys. Finally, we resolve the list of keys from the cache. Finding the maximum shared string start is as simple as this:

private static string GetLowestCommonDenominator(string x, string y)
{
	var sb = new StringBuilder();
	for (int i = 0; i < x.Length && i < y.Length; i++)
	{
		if (x[i] != y[i])
			break;
		sb.Append(x[i]);
	}
	return sb.ToString();
}

One important implementation detail is determining if a partial value string is between the start and end delimiters that we are given. We do this using IsMatchingNode method:

private static bool IsMatchingNode(TrieNode cache, string start, string end)
{
	if (cache.KeyInCache == null)
		return false;

	bool greaterThanOrEqStart;
	string partialValue = cache.PartialValue;
	if (partialValue.Length > start.Length)
		greaterThanOrEqStart = partialValue.Substring(0, start.Length).CompareTo(start) >= 0;
	else
		greaterThanOrEqStart = partialValue.CompareTo(start) >= 0;
	if (greaterThanOrEqStart == false)
		return false;
	return StillInRange(partialValue, end);
}

private static bool StillInRange(string partialValue, string end)
{
	bool smallerOrEqEnd;
	if (partialValue.Length > end.Length)
		smallerOrEqEnd = partialValue.Substring(0, end.Length).CompareTo(end) <= 0;
	else
		smallerOrEqEnd = partialValue.CompareTo(end) <= 0;
	return smallerOrEqEnd;
}

Those are ugly, I'll admit.

Now the only thing that we have left is how we process a node recursively:

private static bool ProcessNode(ICollection<string> keys, 
	string nameOfItemInCache, string start, string end)
{
	var node = (TrieNode)DistributedCache.GetFromCache(nameOfItemInCache);
	if (IsMatchingNode(node, start, end))
		keys.Add(node.KeyInCache);
	if (StillInRange(node.PartialValue, end) == false)
		return false;
	foreach (var value in node.Children.Values)
	{
		if (ProcessNode(keys, value, start, end) == false)
			return false;
	}
	return true;
}

Given all of that, we can now write the following client code:

public static void Main()
{
	var words = new Dictionary<string, string>
	{
		{"foo", "hello"},
		{"bar", "beer"},
		{"fubar", "snow"},
		{"welcome", "black"},
		{"forward", "rose"},
		{"next", "lion"},
		{"exit", "hungry"},
		{"seek", "monk"},
	};

	DistributedCache.PutInCache(TrieManager.CreateNodeKey(""), new TrieNode { KeyInCache = null, PartialValue = "" });

	foreach (var word in words)
	{
		TrieManager.PutInCacheAndUpdateTrie(word);
	}
	var items = TrieManager.GetFromCacheInRange("ba", "bz");
	if (items == null)
		return;
	foreach (var item in items)
	{
		Console.WriteLine(item.Key + " = " + item.Value);
	}
}

With the output being:

bar = beer
welcome = black

This may not seem like an impressive fit, but imagine if we stored things like Users in the DHT, not just a string. This would allow us to perform a range query over the user names and get the actual users (changing the sample code to do this is left as an exercise for the reader).

The code for the Trie implementation can be found in the scratch pad.

More posts in "Patterns for Distributed Hash Tables" series:

  1. (09 Aug 2008) Range Queries
  2. (09 Aug 2008) Lookup by property
  3. (09 Aug 2008) Cheap Cross Item Transactions
  4. (09 Aug 2008) Item Groups
  5. (08 Aug 2008) Locality