Monday, March 25, 2013

Git Merge Versus Rebase

You will see a lot of strange advice in Google searches on merge versus rebase, and on how to deal with conflict resolution. The advice conflicts, so it obviously isn't all good advice. This link here has it right:


The biggest usability goof in the git interface is that "git pull" isn't a rebase by default. With a rebase, the incoming changes land between the old origin state and our own commits, and our edits are then replayed on top of the incoming changes. (Read gitguru's article until it makes sense!)
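
To see that ordering concretely, here is a minimal sketch that can be run in a scratch directory. It assumes git is installed; the bare repository `origin.git`, the second developer Bob, and the file names are hypothetical stand-ins for a real shared origin.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the shared origin (hypothetical).
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Alice creates the first commit and publishes it.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"
git remote add origin ../origin.git
git push -q -u origin master
cd ..

# Bob clones, commits, and pushes the incoming change.
git clone -q origin.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo fix > bugfix.txt
git add bugfix.txt
git commit -q -m "Bob: bug fix"
git push -q origin master
cd ..

# Alice commits locally without knowing about Bob's push, then pulls with rebase.
cd alice
echo feature > feature.txt
git add feature.txt
git commit -q -m "Alice: local work"
git pull -q --rebase

# Newest first: Alice's commit is listed above Bob's, and there is no merge commit.
git log --format=%s
```

Running `git config pull.rebase true` inside a repository makes a plain "git pull" behave this way there.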

The fact that rebase rewrites history is feared by some developers, but it's not a problem as long as pre-rebased (and pre-squashed) commits are never pushed up into the master branch. In other words, you can rebase whatever is still local to your machine, or has only been cloned into a throw-away branch that somebody reviewing your code is using.

The Common Case

Your pull (on your master branch) should always be "git pull --rebase", or similarly "git fetch; git rebase", except for a few special cases. Doing the fetch and the rebase separately lets you deal with rebase conflicts in master (which you should not have if you follow this discipline). It would also be helpful if git branches were forced to be named like file paths, to reflect the history of how branches were created. Consider this slightly verbose way of working:

#Alice has been asked to add Xmpp and Gopher protocol support
#so we make the tree of branches
git clone git://origina/ManhattanProject
cd ManhattanProject
git checkout master
git checkout -b masterProtocolSupport
git checkout -b masterProtocolSupportXmpp
git checkout masterProtocolSupport
git checkout -b masterProtocolSupportGopher

#Alice is asked to fix an SSL bug
git checkout master
git checkout -b masterFixSSLBug

Each task is in its own branch, because only a subset of these branches will ultimately be approved to be pushed into the repository. The clone creates a leaf off of the origin master branch. Doing a checkout -b creates a leaf off of the currently checked-out branch and switches over to that branch. Alice gets to work, making commits into these branches.
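
As a rough illustration of that tree, the sketch below rebuilds the protocol branches in a scratch repository and makes one commit on each leaf. A local `git init` stands in for the clone, and the file names and commit messages are made up for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"

# Same tree of branches as in the listing above.
git checkout -q -b masterProtocolSupport
git checkout -q -b masterProtocolSupportXmpp
echo xmpp > xmpp.txt
git add xmpp.txt
git commit -q -m "Add Xmpp support"

git checkout -q masterProtocolSupport
git checkout -q -b masterProtocolSupportGopher
echo gopher > gopher.txt
git add gopher.txt
git commit -q -m "Add Gopher support"

# Each leaf is one commit ahead of its parent branch.
git rev-list --count masterProtocolSupport..masterProtocolSupportXmpp
git rev-list --count masterProtocolSupport..masterProtocolSupportGopher
# Both leaves fork from the same commit, which is still the tip of master.
git merge-base masterProtocolSupportXmpp masterProtocolSupportGopher
```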


Alice is told that bug fixes were checked into master and pushed to the origin. We always follow the rule that we begin at master and rebase before we merge, so that merges will fast-forward. If there are any conflicts, they are always resolved in rebases, not in merges. (This rule is only violated in scenarios where we have to merge trees across different origin servers, where the common ancestor commit is too far in the past.) This is the default scenario:

git checkout master
git pull --rebase
#we should only get conflicts when rebasing leaf branches - if you get a conflict here, you screwed up
#stop checking into your local copy of master after you fix this mess.
git checkout masterProtocolSupport
git rebase master
git checkout masterProtocolSupportXmpp
#may get conflicts on this rebase - fix them here
git rebase masterProtocolSupport
#if the rebase stops on a conflict, fix the files, git add them, then run:
#git rebase --continue

git checkout masterProtocolSupportGopher
#fix conflicts here
git rebase masterProtocolSupport
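
Fixing a conflict during a rebase can be sketched end to end in a scratch repository. The file `settings.txt`, its contents, and the commit messages are hypothetical, and `GIT_EDITOR=true` is only there so the example runs without opening an editor.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo "port=1" > settings.txt
git add settings.txt
git commit -q -m "Initial settings"

# The leaf branch edits the line.
git checkout -q -b masterFixSSLBug
echo "port=443" > settings.txt
git commit -q -am "Use the SSL port"

# Meanwhile master changes the same line.
git checkout -q master
echo "port=80" > settings.txt
git commit -q -am "Use the HTTP port"

# Rebasing the leaf stops with a conflict in settings.txt.
git checkout -q masterFixSSLBug
git rebase master >/dev/null 2>&1 || echo "rebase stopped on a conflict"

# Resolve by writing the wanted content, stage it, and continue the rebase.
echo "port=443" > settings.txt
git add settings.txt
GIT_EDITOR=true git rebase --continue >/dev/null

# Newest first: the leaf commit now sits directly on top of master.
git log --format=%s
```

If the resolution goes wrong part way through, "git rebase --abort" puts the branch back where it was before the rebase started.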

Assuming that the conflicts you get in the Xmpp and Gopher branches are distinct, this is straightforward. You then need to figure out what subset of your work will be approved to be merged into the repository. So, somebody pulls Alice's masterProtocolSupportGopher and masterProtocolSupportXmpp branches. The reviewer decides that the Gopher support isn't worth the trouble and tells Alice to discard the Gopher work and push the Xmpp branch. The SSL fixes are also approved without controversy. So we need to make a test branch that includes all that is approved:

#start from the root of the tree and just ensure that we rebase before we push in all cases.
#if fastforward merge fails, then fix it in the leaf and merge it up rather than messing with non-leaf
git checkout master
git pull --rebase
git checkout masterProtocolSupport
git rebase master
git checkout masterProtocolSupportXmpp 
git rebase masterProtocolSupport #squash commits after this
git checkout masterFixSSLBug 
git rebase master #squash commits after this
#we are rebased

#make a disposable testmaster branch that merges all approved work
#that we can run tests on.  merge into test rather than master
git checkout master
git checkout -b testmaster
#merge everything down into test
git checkout masterProtocolSupport
git merge masterProtocolSupportXmpp
git checkout testmaster
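git merge masterProtocolSupport #merge approved branches
#testmaster has now moved ahead of master, so masterFixSSLBug only fast-forwards
#if it is first rebased onto testmaster; otherwise git records a merge commit here
git merge masterFixSSLBug #merge approved branches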

#this proves that a push to origin master will 
#fast-forward if you do it now.
#run tests before you do that!
#if it works, then merge this into master, push it, 
#and discard this branch when done.
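
The testmaster idea can be tried in a scratch repository. This is a simplified sketch, not the exact listing above: the intermediate masterProtocolSupport branch is left out, the file names are invented, and "--ff-only" is added so that git refuses anything that is not a fast-forward. Rebasing the second branch onto testmaster is one way to keep every merge a fast-forward.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"

# Two approved leaf branches, each one commit ahead of master.
git checkout -q -b masterProtocolSupportXmpp
echo xmpp > xmpp.txt
git add xmpp.txt
git commit -q -m "Add Xmpp support"
git checkout -q master
git checkout -q -b masterFixSSLBug
echo ssl > ssl.txt
git add ssl.txt
git commit -q -m "Fix SSL bug"

# Disposable integration branch; the first merge fast-forwards.
git checkout -q master
git checkout -q -b testmaster
git merge -q --ff-only masterProtocolSupportXmpp

# testmaster has moved, so rebase the sibling onto it before merging.
git checkout -q masterFixSSLBug
git rebase -q testmaster
git checkout -q testmaster
git merge -q --ff-only masterFixSSLBug

# (Run the tests here.) master can then fast-forward to the tested state.
git checkout -q master
git merge -q --ff-only testmaster
git branch -d testmaster >/dev/null

# Linear history, newest first, with no merge commits.
git log --format=%s
```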

So we merge the protocol support and SSL fixes into our test branch after we squash the commits on our leaf branches. Squashing lets us include commentary from the reviewer (who it was, etc.) and remove irrelevant noise from the detailed original commits. Because we always rebase before we merge, the merges should fast-forward (a sibling branch such as masterFixSSLBug fast-forwards only if it is rebased onto testmaster first; otherwise git records a merge commit). This gives us a very clean and easy-to-read git log.
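
The text does not say how the squash is done; "git rebase -i" is the usual interactive route. As a non-interactive stand-in, the sketch below uses "git reset --soft" on a leaf that is already rebased onto master. The reviewer name Bob, the file name, and the commit messages are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"

# Two noisy commits on a leaf that is already based on the tip of master.
git checkout -q -b masterFixSSLBug
echo "fix part 1" > ssl.txt
git add ssl.txt
git commit -q -m "wip"
echo "fix part 2" >> ssl.txt
git commit -q -am "wip again"

# Move the branch pointer back to master but keep all changes staged,
# then record them as one commit with the review note in the message.
git reset -q --soft master
git commit -q -m "Fix SSL bug (reviewed by Bob)"

# The leaf is now exactly one commit ahead of master.
git log --format=%s master..masterFixSSLBug
```

This only works as written when the leaf has just been rebased onto master; otherwise the reset would also sweep up differences that belong to master.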


We run our tests and see that they pass. We can also run them before we ask for a code review of our leaf branches, but ultimately we need to run tests on the subset of branches that was approved, to ensure that we don't push breakage into the repository when an unapproved branch gets left out. After the tests pass, push the work up:

#the merge into testmaster proved that the push would fast-forward if we did it.
git checkout master
git merge testmaster #merge what we tested, equiv to merging approved branches
git push origin master
#we don't need this branch now
git branch -d testmaster


If you work like this, then you won't be dealing with merge problems in git.  You won't be pushing broken commits up into the master branch for others to figure out (because you can actually integration test before you even push!).  You will still have to resolve conflicts when you rebase, but that is exactly where the resolution needs to be done.  There will still be unusual scenarios where you have to merge in both directions, when you are not dealing with a straightforward tree of branches around the master.  It is possible to use merges for everything, and there is no permanent damage to the repository if you make a mistake.

There will also be scenarios where you want to commit into a non-leaf node like masterProtocolSupport, for cases where the same merge conflicts happen in both leaf branches. In that case, Xmpp and Gopher would treat masterProtocolSupport like the remote. The same rebase-before-merge rule still applies.

There are also cases where you might not rebase from a remote, violating the rule of rebasing from the root towards the leaves. You may have to merge when you have multiple origins (origina, originb) that are evolving independently (because you want to rebase on one consistent parent branch).