Git/Branching & merging

Branching is supported in most VCSes. For example, Subversion makes a virtue of “cheap copying”—namely, that creating a new branch does not mean making a copy of the whole source tree, so it is fast. Git’s branching is just as fast. However, where Git really comes into its own is in merging between branches, in particular, reducing the pain of dealing with merge conflicts. This is what makes it so powerful in enabling collaborative software development.

Why Branch?
There are many reasons for creating multiple branches in a Git repo.
 * You may have branches representing “stable” releases, which continue to get incremental bug fixes but no (major) new features. At the same time, you may have multiple “unstable” branches representing various new features being proposed for the next major release, and being worked on in parallel, perhaps by different groups. Those features which are accepted will need to be merged into the branch for the next stable release.
 * You can create your own private branches for personal experiments. Later, if the code becomes sufficiently interesting to tell others about, you may make those branches public. Or you could send patches to the maintainer of the upstream public branch, and if they get accepted, you can pull them back down into your own copy of the public branch, and then you can retire or delete your private branch.

You may, in fact, want to add updates to different branches at different times. Switching between branches is easy.

View your branches
Use git branch with nothing else to see what branches your repository has:

$ git branch
* master

The branch called "master" is the default main line of development. You can rename it if you want, but it is customary to use the default. When you commit some changes, those changes are added to the branch you have checked out, in this case master.

Create new branches
Let's create a new branch we can use for development, called "dev":

$ git branch dev
$ git branch
  dev
* master

This only creates the new branch; HEAD stays where it is. You can see from the * that the master branch is still what you have checked out. You can now use git checkout dev to switch to the new branch.

Alternatively, you can create a new branch and check it out all at once with

$ git checkout -b <branch-name>
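The commands in this section can be tried out safely in a throwaway repository. The following is only a minimal sketch: the temporary directory, the identity settings and the branch name "experiment" are assumptions made for illustration, not part of any real project.

```shell
#!/bin/sh
# Sketch: list, create and switch branches in a scratch repository.
set -e
repo=$(mktemp -d)                          # throwaway directory (assumes mktemp exists)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called "master"
git config user.name "Example User"        # hypothetical identity, needed for commits
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"

git branch                                 # lists the single branch, marked with *
git branch dev                             # create "dev"; HEAD stays on master
git checkout -q dev                        # switch to it
git checkout -q -b experiment              # create another branch and switch in one step

current=$(git rev-parse --abbrev-ref HEAD)
branches=$(git for-each-ref --format='%(refname:short)' refs/heads | tr '\n' ' ')
echo "current branch: $current"
echo "all branches:   $branches"
```

The repository lives under a temporary directory, so it can simply be removed afterwards.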

Delete a branch
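To delete a branch, again use git branch, but this time pass the -d argument. (Git will not delete the branch you currently have checked out, so switch to another branch first.)

$ git branch -d <branch-name>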

If the branch hasn't been merged into the branch you have checked out (master, here), then this will fail:

$ git branch -d foo
error: The branch 'foo' is not a strict subset of your current HEAD.
If you are sure you want to delete it, run 'git branch -D foo'.

Git's complaint saves you from possibly losing your work in the branch. If you are still sure you want to delete the branch, use git branch -D instead.

Sometimes there are a lot of local branches which have been merged on the server, so have become useless. To avoid deleting them one by one, just use:

git branch -D `git branch --merged | grep -v \* | xargs`
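The difference between the safe and the forced delete can be seen in a scratch repository. This is a sketch under assumptions: the branch names "old-feature" and "foo" and the identity settings are made up for illustration, and the cleanup loop uses -d rather than -D so that Git itself still refuses anything that is not fully merged.

```shell
#!/bin/sh
# Sketch: -d refuses unmerged branches; merged ones can be cleaned up in a loop.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"

git branch old-feature                     # same commit as master, so fully merged
git checkout -q -b foo
git commit -q --allow-empty -m "work that exists only on foo"
git checkout -q master

# The safe delete fails for foo, because its commit is not reachable from master.
if git branch -d foo 2>/dev/null; then refused=no; else refused=yes; fi

# Delete every merged branch except the one currently checked out.
for b in $(git branch --merged | grep -v '^\*'); do
    git branch -d "$b"
done

remaining=$(git for-each-ref --format='%(refname:short)' refs/heads | tr '\n' ' ')
echo "delete of foo refused: $refused"
echo "remaining branches:    $remaining"
```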

Pushing a branch to a remote repository
When you create a local branch, it won't automatically be kept in sync with the server. Unlike branches obtained by pulling from the server, simply calling git push isn't enough to get your branch pushed to the server. Instead, you have to explicitly tell Git to push the branch, and which server to push it to:

$ git push <remote-name> <branch-name>

Deleting a branch from the remote repository
To delete a branch that has been pushed to a remote server, use the following command:

$ git push <remote-name> :<branch-name>

This syntax isn't intuitive, but what's going on here is you're issuing a command of the form:

$ git push <remote-name> <local-branch>:<remote-branch>

and giving an empty branch in the <local-branch> position, meaning to overwrite the remote branch with nothing.
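Pushing and deleting a remote branch can be rehearsed without a real server by letting a bare repository on the local disk stand in for it. This is only a sketch: the directory layout, the remote name "origin" and the branch name "dev" are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: push a new branch to a stand-in "server", then delete it there.
set -e
work=$(mktemp -d)
git init -q --bare "$work/server.git"      # plays the role of the remote server
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"
git remote add origin "$work/server.git"
git push -q origin master

git branch dev
git push -q origin dev                     # explicitly push the new branch
after_push=$(git ls-remote --heads origin | wc -l | tr -d ' ')

git push -q origin :dev                    # empty source side: delete the remote branch
after_delete=$(git ls-remote --heads origin | wc -l | tr -d ' ')

echo "remote branches after push:   $after_push"
echo "remote branches after delete: $after_delete"
```

Note that deleting the branch on the remote leaves the local "dev" branch untouched.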

Merging
Branching is a core concept of a DVCS, but without good merging support, branches would be of little use.

$ git merge <branch-name>

This command merges the given branch into the current branch. If the current branch is a direct ancestor of the given branch, a fast-forward merge occurs, and the current branch head is simply moved to point at the tip of the given branch. In other cases, a merge commit is recorded that has both the previous commit and the given branch tip as parents. If there are any conflicts during the merge, it will be necessary to resolve them by hand before the merge commit is recorded.
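The two outcomes described above, a fast-forward and a true merge commit, can be told apart by counting the parents of the resulting commit. The following sketch assumes a scratch repository; the file names and the branch names "dev" and "topic" are made up for illustration.

```shell
#!/bin/sh
# Sketch: a fast-forward merge versus a merge that records a merge commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"
echo one > a.txt; git add a.txt; git commit -q -m "first commit"

# Case 1: master has not moved since dev branched off, so the merge fast-forwards.
git checkout -q -b dev
echo two > b.txt; git add b.txt; git commit -q -m "add b on dev"
git checkout -q master
git merge -q dev
if [ "$(git rev-parse master)" = "$(git rev-parse dev)" ]; then
    fast_forwarded=yes
else
    fast_forwarded=no
fi
# rev-list --parents prints the commit followed by its parents; subtract the commit itself.
ff_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))

# Case 2: both branches gain a commit, so a merge commit with two parents is recorded.
git checkout -q -b topic
echo three > c.txt; git add c.txt; git commit -q -m "add c on topic"
git checkout -q master
echo four > d.txt; git add d.txt; git commit -q -m "add d on master"
git merge -q -m "merge topic into master" topic
merge_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))

echo "fast-forwarded: $fast_forwarded (HEAD has $ff_parents parent)"
echo "merge commit has $merge_parents parents"
```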

Handling a Merge Conflict
Sooner or later, if you’re doing regular merges, you will hit a situation where the branches being merged will include conflicting changes to the same source lines. How you resolve this situation will be a matter of judgement (and some hand-editing), but Git provides tools you can use to try to get an insight into the nature of the conflict(s), and how best to resolve them.

Real-world examples of merge conflicts tend to be nontrivial. Here we will try to create a very simple, albeit artificial, example, to try to give you some flavour of what is involved.

Let us start with a repo containing a single Python source file, called test.py. Its initial contents are as follows:


#!/usr/bin/python3
#+
# This code doesn't really do anything at all.
#-

def func_common
    pass
#end func_common

def child1
    func_common
#end child1

def child2
    func_common
#end child2

def some_other_func
    pass
#end some_other_func

Commit this file to the repo, with a commit message saying something like “first version”.

Now create a new branch and switch to it, using the command

$ git checkout -b side-branch

(This second branch is to simulate work being done on the same project by another programmer.) Edit the file test.py, and simply swap the definitions of the functions child1 and child2 around, equivalent to applying the following patch:

diff --git a/test.py b/test.py
index 863611b..c9375b3 100644
--- a/test.py
+++ b/test.py
@@ -7,14 +7,14 @@ def func_common
     pass
 #end func_common
 
-def child1
-    func_common
-#end child1
-
 def child2
     func_common
 #end child2
 
+def child1
+    func_common
+#end child1
+
 def some_other_func
     pass
 #end some_other_func

Commit the update to the branch side-branch with a message like “swap a pair of functions around”.

Now switch back to the <TT>master</TT> branch:

$ git checkout master

This will also put you back to the previous version of <TT>test.py</TT>, since that was the last (in fact only) version committed to that branch.

On this branch, we now rename the function <TT>func_common</TT> to <TT>common</TT>, equivalent to the following patch:

diff --git a/test.py b/test.py
index 863611b..088c125 100644
--- a/test.py
+++ b/test.py
@@ -3,16 +3,16 @@
 # This code doesn't really do anything at all.
 #-
 
-def func_common
+def common
     pass
-#end func_common
+#end common
 
 def child1
-    func_common
+    common
 #end child1
 
 def child2
-    func_common
+    common
 #end child2
 
 def some_other_func

Commit this change to the <TT>master</TT> branch, with a message like “rename func_common to common”.

Now, try to merge in the change you made on <TT>side-branch</TT>:

$ git merge side-branch

This should immediately fail, with a message like

Auto-merging test.py
CONFLICT (content): Merge conflict in test.py
Automatic merge failed; fix conflicts and then commit the result.

Just to check what git status reports:

$ git status
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")

Unmerged paths:
  (use "git add <file>..." to mark resolution)

	both modified:      test.py

no changes added to commit (use "git add" and/or "git commit -a")
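The whole scenario up to this point can be reproduced with a short script. Treat it as a sketch under assumptions: the helper function write_test_py is hypothetical, the second line of the file ("#+") is assumed, and the file is written exactly as printed in this chapter, which Git merges as plain text regardless of whether Python would accept it.

```shell
#!/bin/sh
# Sketch: recreate the two conflicting changes to test.py and attempt the merge.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

# Hypothetical helper: $1 = name of the common function,
# $2 and $3 = the order in which the two child functions appear.
write_test_py() {
    {
        printf '%s\n' '#!/usr/bin/python3' '#+' \
            "# This code doesn't really do anything at all." '#-' ''
        printf 'def %s\n    pass\n#end %s\n\n' "$1" "$1"
        for child in "$2" "$3"; do
            printf 'def %s\n    %s\n#end %s\n\n' "$child" "$1" "$child"
        done
        printf 'def some_other_func\n    pass\n#end some_other_func\n'
    } > test.py
}

write_test_py func_common child1 child2
git add test.py
git commit -q -m "first version"

git checkout -q -b side-branch
write_test_py func_common child2 child1    # swap the two child functions
git commit -q -am "swap a pair of functions around"

git checkout -q master
write_test_py common child1 child2         # rename func_common to common
git commit -q -am "rename func_common to common"

# The merge is expected to stop with a conflict, so do not let set -e abort here.
if git merge side-branch >/dev/null 2>&1; then merged=yes; else merged=no; fi

# Paths with unmerged index entries are the conflicted files.
conflicted=$(git ls-files -u | cut -f2 | sort -u)
echo "merge completed: $merged"
echo "conflicted files: $conflicted"
```

The repository is left in the middle of the merge, so the inspection commands discussed in the rest of this section can be run inside it.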

If we look at <TT>test.py</TT> now, it should look like


#!/usr/bin/python3
#+
# This code doesn't really do anything at all.
#-

def common
    pass
#end common

<<<<<<< HEAD
def child1
    common
#end child1

=======
>>>>>>> side-branch
def child2
    common
#end child2

def child1
    func_common
#end child1

def some_other_func
    pass
#end some_other_func

Note those sections marked “<<<<<<< HEAD” ... “=======” ... “>>>>>>> src-branch”: the part between the first two markers comes from the HEAD branch, the one we are merging onto (<TT>master</TT>, in this case), while the part between the last two markers comes from the branch named src-branch, which we are merging from (<TT>side-branch</TT>, in this case).

Assuming we know exactly what the code does, we can carefully fix up all the conflicting/duplicated parts, remove the markers, and continue the merge. But perhaps this is a large project, and no single person, not even the project leader, fully understands every corner of the code. In this case, it is helpful to at least narrow down the set of commits that lead directly to the conflict, in order to get a handle on what is going on. There is a command that you can use, git log --merge, which is designed specifically to be used during a merge conflict, for just this purpose. In this example, I get output something like this:

$ git log --merge
commit 9df4b11586b45a30bd1e090706e3ff09692fcfa7
Author: Lawrence D'Oliveiro <ldo@geek-central.gen.nz>
Date:   Thu Apr 17 10:44:15 2014 +0000

    rename func_common to common

commit 4e98aa4dbd74543d7035ea781313c1cfa5517804
Author: Lawrence D'Oliveiro <ldo@geek-central.gen.nz>
Date:   Thu Apr 17 10:43:48 2014 +0000

    swap a pair of functions around

Now, as project leader, I can look further at just those two commits, and figure out that the nature of the conflict is really quite simple: one branch has swapped the order of two functions, while the other has changed the name of another function being referenced within the rearranged code.

Another useful command is git diff --merge, which shows a 3-way diff between the state of the source file in the staging area, and the versions from the parent branches:

$ git diff --merge
diff --cc test.py
index c9375b3,863611b..088c125
--- a/test.py
+++ b/test.py
@@@ -3,18 -3,18 +3,18 @@@
  # This code doesn't really do anything at all.
  #-
  
--def func_common
++def common
      pass
--#end func_common
- 
- def child2
-     func_common
- #end child2
++#end common
  
  def child1
--    func_common
++    common
  #end child1
  
+ def child2
 -    func_common
++    common
+ #end child2
+ 
  def some_other_func
      pass
  #end some_other_func

Here you see, in the first two columns of each line, “+” and “-” characters indicating lines added/removed with respect to the two branches, or a space indicating no change.

Armed with this information, I can approach the problem of fixing up the conflicted file with a bit more confidence, creating the following merged version of <TT>test.py</TT>:


#!/usr/bin/python3
#+
# This code doesn't really do anything at all.
#-

def common
    pass
#end common

def child2
    common
#end child2

def child1
    common
#end child1

def some_other_func
    pass
#end some_other_func

Just to recheck, after doing git add on the above fixed version, but before committing, do another git diff --merge, which should produce output like:

diff --cc test.py
index c9375b3,863611b..088c125
--- a/test.py
+++ b/test.py
@@@ -3,18 -3,18 +3,18 @@@
  # This code doesn't really do anything at all.
  #-
  
--def func_common
++def common
      pass
--#end func_common
- 
- def child2
-     func_common
- #end child2
++#end common
  
  def child1
--    func_common
++    common
  #end child1
  
+ def child2
 -    func_common
++    common
+ #end child2
+ 
  def some_other_func
      pass
  #end some_other_func

And what does git status say?

$ git status
On branch master
All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)

Changes to be committed:

	modified:   test.py

Now when you do as it says and enter git commit, Git automatically finishes the merge.

“The Stupid Content Tracker”
The git(1) man page summarizes Git as “the stupid content tracker”. It is important to understand what “stupid” means in this case: Git does not use elaborate algorithms to try to handle merge conflicts automatically; instead, it concentrates on displaying just the relevant information, so that human intelligence can resolve the conflict. Linus Torvalds has famously said that he wouldn’t trust his code to such elaborate merge conflict-resolution systems, which is why he deliberately designed Git to be “stupid”, and therefore reliable.
