Reviewing pull requests with large number of commits
I have to go over a fairly compress pull request, adding a pretty big feature. The feature, if you care, is compressing data in the leaf pages of a B+Tree for use in Map/Reduce entries storage. The current and non final results are reducing storage costs from about 650MB to 32MB, so we are pretty happy with it, but this isn’t the topic of this post.
At this stage, the feature is not complete, but it is ready for initial review, so we created a temporary PR to review it, here is a partial view:
There are over 30 commits over a period of three weeks there, and they represent a lot of experimentation as we go along. This is fine & proper as far as we are concerned, since we this represent quite a bit of work, and we certainly want to be able to track it. But since this isn’t completed, the commit history is long, which means that reviewing it one commit at a time is not trivial. In particular, there are a lot of back and forth changes in the same place, and I don’t want to review code that has been changed.
Luckily, git make it easy to change that view.
I pull the changes to my local branch, then reset the log to the last commit before the merge, then I just squashed all those commit into a single one. You can check it out here.
The final result is:
Which is much nicer way to go about reviewing this, since I can look at all the changes in one shot, and more to the point, I can make comments on the changes using our normal workflow.
Comments
Because you're mentioning PRs, I'm assuming you're using GitHub. They have a tab called 'Files Changed' where you can view all the changes regardless of how many commits there are, without the need to squash and deform history. Also you can comment on them and start reviews as you normally would.
Erik, Yes, that is nice for smaller commits. For larger ones, I want to also look at the code in place, bounce back & forth, etc.
I agree with Oren on this. GitHub PR review is fine for small trivial changes, but I often want to follow references to see how methods are used or to refresh my memory on what a method call is doing. GitHub 'File Changes' doesn't really give me the full context of the change.
@Joe: You are right that GitHub doesn't support following references, so checking out the code can definitely be worthwhile to review, but you don't need to squash the commits to do that. Just check out the branch unsquashed. And as Erik mentions, you can still see all changes packaged together in GitHubs view.. so what is squashing the commits actually helping in this case?
Comment preview