In my previous post, we increased the capacity of the cluster by moving all new work to a new set of servers. In this post, I want to deal with a slightly harder problem: what to do when it isn’t new data that is causing the issue, but existing data. In that case we can’t just throw a new server at the problem; we need to actually move data between nodes.
We started with the following configuration:
var shards = new Dictionary<string, IDocumentStore>
{
    {"Shared", new DocumentStore {Url = "http://rvn1:8080", DefaultDatabase = "Shared"}},
    {"EU", new DocumentStore {Url = "http://rvn2:8080", DefaultDatabase = "Europe"}},
    {"NA", new DocumentStore {Url = "http://rvn3:8080", DefaultDatabase = "NorthAmerica"}},
};
And what we want is to add another server for EU and NA. Our new topology would be:
var shards = new Dictionary<string, IDocumentStore>
{
    {"Shared", new DocumentStore {Url = "http://rvn1:8080", DefaultDatabase = "Shared"}},
    {"EU1", new DocumentStore {Url = "http://rvn2:8080", DefaultDatabase = "Europe1"}},
    {"NA1", new DocumentStore {Url = "http://rvn3:8080", DefaultDatabase = "NorthAmerica1"}},
    {"EU2", new DocumentStore {Url = "http://rvn4:8080", DefaultDatabase = "Europe2"}},
    {"NA2", new DocumentStore {Url = "http://rvn5:8080", DefaultDatabase = "NorthAmerica2"}},
};
There are a couple of things that we need to pay attention to. First, we no longer use the EU / NA shard keys; they have been removed in favor of EU1 & EU2 / NA1 & NA2. We’ll also change the sharding configuration so that it splits the new data evenly between the two nodes for each region (see the previous post for the details on exactly how this is done). But what about the existing data? We need to have some way of actually moving it. That is where our ops tools come into play.
We use the smuggler to move the data between the servers:
Raven.Smuggler.exe between http://rvn2:8080 http://rvn2:8080 --database=Europe --database2=Europe1 --transform-file=transform-1.js --incremental
Raven.Smuggler.exe between http://rvn2:8080 http://rvn4:8080 --database=Europe --database2=Europe2 --transform-file=transform-2.js --incremental
Raven.Smuggler.exe between http://rvn3:8080 http://rvn3:8080 --database=NorthAmerica --database2=NorthAmerica1 --transform-file=transform-1.js --incremental
Raven.Smuggler.exe between http://rvn3:8080 http://rvn5:8080 --database=NorthAmerica --database2=NorthAmerica2 --transform-file=transform-2.js --incremental
The commands are pretty similar, differing only in their options, so let us try to figure out what is going on. We are asking the smuggler to move the data between two databases in an incremental fashion, while applying a transform script. The transform-1.js file looks like this:
function(doc) {
    var id = doc['@metadata']['@id'];
    var node = (parseInt(id.substring(id.lastIndexOf('/')+1)) % 2);
    if(node == 1)
        return null;
    doc["@metadata"]["Raven-Shard-Id"] = doc["@metadata"]["Raven-Shard-Id"] + (node+1);
    return doc;
}
And transform-2.js is exactly the same, except that it returns early if node is 0. In this way, we are able to split the data between the two new servers.
Note that because we use an incremental approach, we can do this even if the copy takes a long while. When we finally switch over, the window of time is very narrow, and we only need to pass the recently changed data.
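To see how the pair of scripts partitions the data, here is a small self-contained sketch that can be run under Node.js, outside of RavenDB. The makeTransform and sampleDoc helpers and the sample ids are hypothetical, introduced only for illustration. The body mirrors transform-1.js, with the parity to keep passed in as a parameter.

```javascript
// Hypothetical helper: builds a transform that keeps only documents whose
// numeric id suffix has the given parity (0 mirrors transform-1.js,
// 1 mirrors transform-2.js).
function makeTransform(keepNode) {
  return function (doc) {
    var id = doc['@metadata']['@id'];
    var node = parseInt(id.substring(id.lastIndexOf('/') + 1), 10) % 2;
    if (node !== keepNode) return null; // the other script handles this document
    // String + number concatenation: "EU" + 1 gives "EU1", "EU" + 2 gives "EU2"
    doc['@metadata']['Raven-Shard-Id'] = doc['@metadata']['Raven-Shard-Id'] + (node + 1);
    return doc;
  };
}

// Made-up sample document carrying only the metadata the scripts look at.
function sampleDoc(id) {
  return { '@metadata': { '@id': id, 'Raven-Shard-Id': 'EU' } };
}

var transform1 = makeTransform(0);
var transform2 = makeTransform(1);
```

With these definitions, a document with an even id suffix survives only transform1 and is tagged EU1, while one with an odd suffix survives only transform2 and is tagged EU2, so every document lands on exactly one of the two targets.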
That still leaves the question of how we are going to deal with old ids. We are still going to have things like “EU/customers/###” in the database, even if those documents are on one of the two new nodes. We handle this, like most low level sharding behaviors, by customizing the sharding strategy. In this case, we override the PotentialShardsFor(…) method:
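public override IList<string> PotentialShardsFor(ShardRequestData requestData)
{
    var potentialShardsFor = base.PotentialShardsFor(requestData);
    // Defensive: if the base strategy did not return a list, there is nothing to rewrite
    if (potentialShardsFor == null)
        return null;
    // Requests aimed at the old EU shard now go to both of its replacements
    if (potentialShardsFor.Contains("EU"))
    {
        potentialShardsFor.Remove("EU");
        potentialShardsFor.Add("EU1");
        potentialShardsFor.Add("EU2");
    }
    // Same for the old NA shard
    if (potentialShardsFor.Contains("NA"))
    {
        potentialShardsFor.Remove("NA");
        potentialShardsFor.Add("NA1");
        potentialShardsFor.Add("NA2");
    }
    return potentialShardsFor;
}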
In this case, we are doing a very simple thing: when the default shard resolution strategy detects that we want to go to the old EU node, we tell it to go to both EU1 and EU2. A more comprehensive solution would narrow it down to the exact server, but that depends on how exactly you split the data, and is left as an exercise for the reader.
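As a starting point for that exercise, and assuming the data was split with the odd/even rule from the transform scripts, the exact shard can be computed from the old id. The sketch is in JavaScript to match the transform script. resolveShard is a hypothetical helper, not part of the RavenDB client, and the same logic would still need to be ported into the C# sharding strategy.

```javascript
// Hypothetical helper: maps an old-style id such as "EU/customers/7" to the
// new shard that should hold it, assuming the odd/even split used by the
// transform scripts. Returns null for anything it does not recognize.
function resolveShard(id) {
  var region = id.substring(0, id.indexOf('/'));                     // "EU" or "NA"
  var number = parseInt(id.substring(id.lastIndexOf('/') + 1), 10);  // numeric suffix
  if ((region !== 'EU' && region !== 'NA') || isNaN(number)) {
    return null; // not an old regional id; fall back to the default resolution
  }
  return region + ((number % 2) + 1); // even -> "EU1"/"NA1", odd -> "EU2"/"NA2"
}
```

Returning null for unrecognized ids keeps the helper conservative: anything that is not an old EU or NA id is left to whatever the default strategy decides.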