It’s Git to be magic!

To help me manage the source code for Pony Express I use the Git distributed version control system (DVCS). A DVCS is a tool used to record and manage the changes (revisions) made to source code over time. It also allows many developers to work on the same project concurrently; each developer works on a ‘copy’ of the source files and the DVCS manages the merging of changes between the copies.

Prior to developing Pony Express I had not used Git a great deal, and never for a real project. I wanted to learn more about how to use it, so I decided I would learn about it ‘on the job’, as it were. Up until a few days ago I was muddling along with it quite well. I knew enough to use the more basic commands (git commit, git log, git diff, git status, git branch and so on), but I knew I was missing something that would make it so much easier…

It was then that I discovered the power of git rebase! This command is very powerful and can do several useful things, two of which I will discuss here. The first is squashing multiple commits into one larger commit. The second is its main use: bringing a branch up to date with another branch when both have diverged since one was forked from the other (or ‘Forward-port local commits to the updated upstream head’, as the man page puts it). I found the man page pretty impenetrable, which is why I found out about this command only recently, and is part of the reason for writing about git rebase here.

Squashing multiple commits into one.

My first use for git rebase was to squash multiple commits into one larger commit. I frequently came across a situation where I would have uncommitted changes in one branch and would need to switch to another branch to do a quick bug fix or something. The problem here is that uncommitted changes are carried across into the other branch when you do a git checkout <branch>, which is not what I want. I had been using git stash, which temporarily ‘stashes’ uncommitted changes, allowing me to change branches and carry out the bug fix. I would then recover the stashed changes using git stash apply when I came back to the original branch.
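For reference, the stash workflow described above might look something like the sketch below. It builds a throwaway repository in a temporary directory; the file names, commit messages and the Example identity are all made up for illustration.

```shell
set -e
# Throwaway repository (hypothetical names throughout).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example"
git config user.email "example@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "A: initial commit"
git branch B2                  # branch that will receive the bug fix
git checkout -q -b B1          # feature branch we are working on

echo "half-finished feature" >> app.txt   # uncommitted work on B1
git stash -q                   # shelve it; the working tree is clean again

git checkout -q B2
echo "bug fix" > fix.txt
git add fix.txt
git commit -q -m "F: fix bug"

git checkout -q B1
git stash apply -q             # bring the half-finished work back
git status --short             # app.txt shows up as modified again
```

Note that git stash apply leaves the entry in the stash list; git stash drop (or git stash pop instead of apply) clears it.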

A much better solution would be to commit the changes before moving to the other branch. However, I don’t want lots of small partial commits swamping the log. The answer to this is to use git rebase in interactive mode to squash the many partial commits into a larger commit. Only the single larger commit would then show up in the log. So how does this work?

Here we have commits A, B and C on the master branch, and commit D on the branch B1, which we are on (indicated by *). Commit D represents the early work on a new feature, which we are continuing with. However, we wish to move to branch B2 to fix a small bug or something.

So we do git commit -a to commit the latest changes as commit E, and git checkout B2 to switch branches.

We can now work on branch B2 and fix the bug, committing the fix as commit F.

We need to finish the work on branch B1, so we do a git checkout B1, complete our changes and commit the final work as commit G.
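The sequence of commits just described can be reproduced with something like the following sketch. The repository is a throwaway one in a temporary directory, and the file names and commit messages are invented; only the branch names B1 and B2 and the commit letters come from the walkthrough above.

```shell
set -e
# Throwaway repository (hypothetical file names and messages).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "C: base"
git branch B2                                     # bug-fix branch
git checkout -q -b B1                             # feature branch
echo "feature: first draft" >> app.txt
git commit -q -a -m "D: begin feature"

echo "feature: halfway" >> app.txt
git commit -q -a -m "E: halfway through feature"  # commit instead of stashing
git checkout -q B2
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "F: bug fix"

git checkout -q B1
echo "feature: done" >> app.txt
git commit -q -a -m "G: finish feature"
git log --format=%s B1                            # G, E, D, C (newest first)
```

One thing to keep in mind is that git commit -a only picks up changes to files git already tracks; brand new files still need a git add first.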

We now have 3 commits, D, E and G, that we would like to squash into one commit containing all the changes for the new feature. To do this we call git rebase with the -i option to start it in interactive mode. We also need to tell git rebase the last commit we want to stay as it is, which for us is commit C. Commit C can be referenced as ‘HEAD~3’ (likewise, D is HEAD~2, E is HEAD~1 and G is HEAD), so the command git rebase -i HEAD~3 starts up an editor and lists the commits D, E and G together with their log messages, like so:

pick D Begun implementation of cool new feature.
pick E Halfway finished cool new feature.
pick G Finished cool new feature.

#Rebase C..G onto C
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#

As you can see, it is also possible to edit and reword commits, but I won’t go into that here. We want to squash commits D, E and G together, so we need to replace the word ‘pick’ with ‘squash’ on the lines for commits E and G, to get this:

pick D Begun implementation of cool new feature.
squash E Halfway finished cool new feature.
squash G Finished cool new feature.

#Rebase C..G onto C
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#

We then save the changes and exit the editor. Git then starts the editor up again to allow us to modify the log message for the single large commit, which starts out as an amalgamation of the individual commit log messages, like this:

# This is a combination of 3 commits.
# The first commit's message is:
Begun implementation of cool new feature.

# This is the 2nd commit message:
Halfway finished cool new feature.

# This is the 3rd commit message:
Finished cool new feature.

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# Currently on branch B1.
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
[snip..]

You modify this message to say whatever you want about the new big commit, and then save and exit as before.

Then, as quick as a flash, git combines all the changes made in those three commits into one new commit, which becomes the new HEAD of branch B1, to give this:

Commits D, E and G have merged into new commit H.
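If you want to try the squash without any risk to a real project, the sketch below runs the whole thing in a throwaway repository. Because a script cannot type into an editor, it uses two stand-ins that are my own assumptions rather than part of the walkthrough: GIT_SEQUENCE_EDITOR is set to a sed command (GNU sed assumed) that changes ‘pick’ to ‘squash’ on every line of the todo list except the first, and GIT_EDITOR is set to true so that the combined log message is accepted unchanged. File names and commit messages are made up.

```shell
set -e
# Throwaway repository (hypothetical file names and messages).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "C: base"
git checkout -q -b B1
for msg in "D: begin feature" "E: halfway through feature" "G: finish feature"; do
    echo "$msg" >> app.txt
    git commit -q -a -m "$msg"
done

git log -1 --format=%s HEAD~3   # HEAD~3 is commit C, the one left untouched

# Stand-ins for the two editor sessions:
#  - the sequence editor turns "pick" into "squash" on all todo lines but the first
#  - GIT_EDITOR=true accepts the combined commit message as it is
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i HEAD~3

git log --format=%s             # only two commits remain: the squashed one and C
```

In day-to-day use you would leave both variables alone and simply run git rebase -i HEAD~3, editing the todo list and the log message by hand as described above.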

As you can see, this makes it very easy to make many commits on a branch and then to parcel them up, when necessary, into more meaningful larger commits. This enables you to change branches with ease, and to work on many areas of code at the same time, without those changes creating utter chaos in your code. I know this certainly aids my productivity and keeps my confusion levels down 🙂

Now that we have looked at one way to use git rebase, let us turn our attention to the main function of the command.

Forward-port local commits to the updated upstream head.

What this gobbledygook means is that git rebase can take the commits from another branch and slot them into your branch underneath (that is, earlier in history than) any commits you have made on the branch you are working on. This is made clearer with a diagram:

Here is the same commit tree as used earlier, after some more work has gone on: commit F (a bug fix) was merged into the master branch, and another commit, I (another developer’s cool new feature), has also been merged in. You are working on branch B1 and have made commits H, J and K (your new killer feature), and you would like to have the changes from the master branch in your branch too. Using git merge in this situation would cause the changes made in commits F and I to be added at the HEAD of branch B1, i.e. after your commits in history. This then gets messy when it comes to pushing your changes into the master branch at a later date. A simple git rebase master is the magical incantation you want to use in this situation. This rebases the branch B1 so that it diverges from branch master after commit I, instead of after commit C as it did before.
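The situation above can be recreated in a throwaway repository with something like this sketch. The commit helper function, the file names and the commit messages are all hypothetical; each commit touches its own file so that no conflicts arise.

```shell
set -e
# Throwaway repository (hypothetical file names and messages).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "C: base"
git checkout -q -b B1
commit feature.txt "H: cool new feature"
commit killer1.txt "J: killer feature, part 1"
commit killer2.txt "K: killer feature, part 2"

git checkout -q master
commit fix.txt "F: bug fix"
commit other.txt "I: feature from another developer"

git checkout -q B1
git rebase -q master        # replay H, J and K on top of commit I
git log --format=%s         # K, J, H, I, F, C (newest first)
```

After the rebase, git merge-base master B1 reports the tip of master, which is another way of saying that B1 now forks from master after commit I.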

The way it does this is to ‘undo’ all the changes in commits H, J and K on branch B1 and save them in a temporary area. It then resets B1 to be the same as master and re-applies commits H, J and K, one at a time, on top of the new HEAD. Now this process isn’t always straightforward; it is possible that there may be conflicts when commits H, J and K are applied. If this is the case, the rebase stops and you will need to resolve those conflicts manually in the usual manner, stage the resolved files with git add, and then run git rebase --continue to complete the rebase.
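Here is a rough sketch of what a conflicting rebase looks like in practice, again in a throwaway repository. Both branches change the same line of a made-up file, config.txt, so the rebase is forced to stop. The ‘manual’ resolution is simulated by overwriting the file with the content we want, and GIT_EDITOR=true is my stand-in for accepting the commit message in case git opens an editor on continue.

```shell
set -e
# Throwaway repository (hypothetical file names and messages).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo "greeting=hello" > config.txt
git add config.txt
git commit -q -m "C: base"

git checkout -q -b B1
echo "greeting=howdy" > config.txt
git commit -q -a -m "H: change greeting on B1"

git checkout -q master
echo "greeting=hi" > config.txt
git commit -q -a -m "I: change greeting on master"

git checkout -q B1
# The rebase stops on the conflict, so a non-zero exit status is expected here.
git rebase master >/dev/null 2>&1 || echo "rebase stopped on a conflict"

# Resolve by hand (simulated by writing the wanted content), then stage the file.
echo "greeting=howdy" > config.txt
git add config.txt
GIT_EDITOR=true git rebase --continue >/dev/null

git log --format=%s          # H, I, C (newest first)
```

If a rebase goes badly wrong part way through, git rebase --abort puts the branch back to how it was before you started.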

Using git rebase like this means that when the time comes to push your changes into the master branch, they should go in with the minimum of conflicts, as your branch has been kept up to date with master.
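As a small illustration of that last point, the sketch below uses a local git merge --ff-only as a stand-in for getting your work into master (how you actually publish changes depends on your setup, so treat this as an assumption). Because B1 has been rebased onto master, master can simply be moved forward to B1 with no merge commit and no conflicts. File names and messages are invented.

```shell
set -e
# Throwaway repository (hypothetical file names and messages).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "C: base"

git checkout -q -b B1
echo "killer" > killer.txt
git add killer.txt
git commit -q -m "K: killer feature"

git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "I: feature from another developer"

git checkout -q B1
git rebase -q master            # B1 now sits on top of commit I

git checkout -q master
git merge -q --ff-only B1       # succeeds because master is an ancestor of B1
git log --format=%s             # K, I, C (newest first), with no merge commit
```

Had B1 not been rebased first, the --ff-only merge would have been refused, since master and B1 would each have contained commits the other lacked.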

Well, that is it for this introduction to the power and magic of git rebase. I hope that it is of use to someone out there grappling with the intricacies of git.


About Paul Elms

Stay at home Dad, ex-biologist, wannabe Android app developer.
