Patch management approaches using centralized SCM


Without getting into the centralized vs. decentralized SCM argument (I understand the differences, I just don't grok them), patch management is important in many scenarios. Contributing to OSS projects is a major one, I admit, but I have previously used these techniques to take emergency fixes made on production and merge them into the development trunk.

The question came up in the NHibernate Contrib mailing list, and Josh Robb has commented on it at length. I thought that it would be a good idea to take that and expand on it a bit.

The problem:

We want to submit a changeset to a project, without having direct access to its source control. The solution is to generate a patch and send it to the destination.
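
To make "generate a patch" concrete, here is a minimal sketch using Python's standard difflib module. The file name calc.py and the helper make_patch are made up for illustration; in practice your SCM client's diff command produces the same kind of unified diff.

```python
import difflib


def make_patch(original: str, modified: str, path: str) -> str:
    """Return a unified diff that turns `original` into `modified`."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Hypothetical file contents: the pristine copy and our locally fixed copy.
base = "def add(a, b):\n    return a - b\n"
fixed = "def add(a, b):\n    return a + b\n"

patch = make_patch(base, fixed, "calc.py")
print(patch)
```

The resulting text is what you would send to the project maintainers, who apply it against their own copy of the base revision.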

So far, it is simple. It gets complex when you need to deal with more than a single changeset that hasn't been merged to the root.

Let us say that we have several changesets that we have generated. Let us see how we treat them, according to the different scenarios we encounter. A scenario, in this case, is the dependence between the changesets.

Scenario #1 - No dependencies between the patches.

This is a common scenario if you are working on several things in parallel. A classic case is when you are fixing several bugs. In most cases, the changes in each bug fix are unrelated to each other, and can be applied independently.

In this case, you usually generate a separate patch for each changeset. This allows each patch to be evaluated in isolation, which significantly eases its acceptance.

This leads us to the First Rule of Patches: keep them small. It is easier to go through seven small patches than one big one.

Scenario #2 - No dependencies between the patches, but touching the same files.

This is the case if two changesets have touched the same file, but there is no logical dependency between the patches. In this case, we still want to get separate patches. Usually, I generate one patch, revert to base, work on the second change, generate another patch, and so on.
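
The "patch, revert, patch" routine can be sketched roughly as follows. This is only an illustration under stated assumptions: the file settings.ini, its contents, and the helper patch_against_base are invented, and "reverting to base" is modeled by starting each change from a fresh copy of the base lines.

```python
import difflib

# Hypothetical base revision of a single file that both fixes touch.
BASE = [
    "timeout = 30\n",
    "retries = 3\n",
    "\n",
    "[logging]\n",
    "level = info\n",
]


def patch_against_base(changed):
    """Diff one independent change against the pristine base revision."""
    return "".join(
        difflib.unified_diff(
            BASE, changed, fromfile="a/settings.ini", tofile="b/settings.ini"
        )
    )


# First change: start from a clean copy of the base and edit line 1.
first_fix = list(BASE)
first_fix[0] = "timeout = 60\n"
patch_a = patch_against_base(first_fix)

# "Revert to base": start again from a clean copy, then edit line 5.
second_fix = list(BASE)
second_fix[4] = "level = debug\n"
patch_b = patch_against_base(second_fix)

print(patch_a)
print(patch_b)
```

Because each patch is generated against the same base, neither one contains the other's change, and they can be reviewed and applied separately.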

Scenario #3 - Logical dependencies between the patches

One patch relies on behavior / API created in another patch. In this case, the best solution is to create a patch for each distinct behavior, and number them, so it is still possible to review them in isolation, but the merge order is clear.
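
As a small sketch of why the numbering matters, assume a hypothetical naming convention where each patch file starts with its sequence number. The merge order can then be recovered mechanically, as long as the numbers are compared as integers rather than as text.

```python
import re


def merge_order(patch_names):
    """Sort patch file names by their leading sequence number."""

    def sequence_number(name):
        match = re.match(r"(\d+)", name)
        if match is None:
            raise ValueError(f"patch name has no leading number: {name!r}")
        return int(match.group(1))

    return sorted(patch_names, key=sequence_number)


# Hypothetical patch files, received in arbitrary order.
received = ["10-update-docs.patch", "2-use-new-api.patch", "1-add-new-api.patch"]
ordered = merge_order(received)
print(ordered)
```

A plain alphabetical sort would put 10 before 2, which is one reason zero-padded numbers (01, 02, ... 10) are a friendlier choice for whoever applies the patches.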

Scenario #4 - Several revisions of the same patch

In this case, you submitted a patch, but continued to work on the same feature/bug and have a new patch before the first one was applied. In this case, the later patch supersedes the previous one, which can now be discarded. You need to be careful with this scenario, because too much disconnected work can create huge patches. It is better to review your work and see if you are in scenario #3 or really in scenario #4.
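
From the receiving side, "the later patch supersedes the previous one" might be sketched like this. The topic names, revision numbers, and file names are all hypothetical; the point is only that, per topic, the highest revision is kept and earlier ones are dropped.

```python
def latest_revisions(patches):
    """Keep only the newest revision of each patch.

    `patches` is an iterable of (topic, revision, filename) tuples.
    Returns a dict mapping each topic to the file name to apply.
    """
    latest = {}
    for topic, revision, filename in patches:
        if topic not in latest or revision > latest[topic][0]:
            latest[topic] = (revision, filename)
    return {topic: filename for topic, (_, filename) in latest.items()}


# Hypothetical submissions: the paging fix was revised before it was applied.
submitted = [
    ("paging-bug", 1, "paging-bug.v1.patch"),
    ("paging-bug", 2, "paging-bug.v2.patch"),
    ("cache-api", 1, "cache-api.v1.patch"),
]
to_apply = latest_revisions(submitted)
print(to_apply)
```

This only works if each revision is a complete replacement generated against the base, not an increment on top of the earlier revision. An incremental follow-up is really scenario #3.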

Anything that I missed?