Git: A review by a not-gonna-use-it-er

This is in response to Drave and Xiper’s questions about my nixing of Git as an option for a replacement version control system.

“Git” is an incredibly popular distributed version control system originally developed by Linus Torvalds, creator of Linux. The fact that it is used by the Linux Kernel development team has given it a firmly established base in Linux development. Add in the “github” project hosting service and it has stealthed into the position of dominant “revision tool thingy” that RCS once held.

“Mercurial” is another popular distributed version control system, lurking in the shadows.

So why did I nix Git and start looking more at Mercurial?

Well, first: why am I looking beyond Subversion?

Foremost: Because we’re not checking in as regularly as we should, we’re not branching on the fly like we should, and because our Subversion repository is woefully unrelated to our issue tracking (trac) system.

The nature of our cross-platform, networked, distributed project is essentially “brancherific”. It doesn’t matter that a branch is an atomic operation on a subversion server. What matters is how long it takes to lug around a snapshot of our project repository.

(Note: A fair chunk of that is Subversion overhead)

So we wind up doing sloppy checkins with lots of stuff bundled into a single commit. (Ramp, for instance, often goes weeks without pulling or committing).

When things go wrong, this makes it incredibly difficult to pull specific changes or patches back out. You have to find the “mega-commit” that includes the offending alterations, and then cherry pick the undesirable components out of it =(

Subversion also doesn’t have particularly robust merge tracking yet, so if you don’t have a rigorous development workflow practice with merges always happening in controlled directions, it’s very easy to get Subversion into a state where it can’t reverse-merge revisions… (*adjusts collar and hopes none of his colleagues read this*)

Finally, our SVN repository is hosted on a Windows 2003 machine (high end dual quad core Xeons) running VisualSVN server. For all that VisualSVN has done for us, its main shortcoming is that it is limited to http:// and https:// protocols, which are drastically slower than Subversion’s native svn:// protocol. It’s not unusual to see TortoiseSVN reporting a transfer rate of a few kb/s. If we’re lucky, we can get 120Kb/s across the local 100Mb/s ethernet on large binary files.

(From a lower-spec’d machine running svnserve, we can expect to see 2-8Mb/s instead of Kb/s; we would probably be better accessing the repository across a Windows mount for full speed throughput)
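To put those rates in perspective, here is a back-of-envelope sketch in Python. The 500 MB snapshot size is a made-up figure for illustration, not a measurement of our repository, and I am assuming the rates above are kilobytes and megabytes per second. Only the ratios really matter.

```python
# Rough transfer-time estimate for lugging around one repository snapshot.
# SNAPSHOT_MB is a hypothetical size chosen purely for illustration.
SNAPSHOT_MB = 500

# Rates quoted above, read as KB/s (an assumption about the units).
RATES_KB_PER_S = {
    "VisualSVN over http(s), on a good day": 120,
    "svnserve, low end": 2 * 1024,
    "svnserve, high end": 8 * 1024,
}


def transfer_seconds(size_mb, rate_kb_per_s):
    """Seconds needed to move size_mb megabytes at rate_kb_per_s."""
    return size_mb * 1024 / rate_kb_per_s


for label, rate in RATES_KB_PER_S.items():
    minutes = transfer_seconds(SNAPSHOT_MB, rate) / 60
    print(f"{label}: about {minutes:.1f} minutes")
```

Under those assumptions the http(s) case is over an hour per snapshot, while svnserve is a matter of minutes, which is the difference between branching casually and not branching at all.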

Enter Distributed Version Control Systems

(Subtext: Roll back to the original RCS style concept)

Before Subversion was CVS, which was descended from “RCS”. RCS was a really simple system for managing changes to files. You could use the revision history to generate patch files that could be sent to other people to merge changes.

Distributed Version Control Systems are basically a throwback to this basic concept: what you have on your hard disk is your repository. With RCS, the task of trading patches is left to the human, through the “patch” command or external tools. With a DVCS, exchanging patches is an integral feature.

When you commit with SVN, local changes are sent to a central repository-server, and immediately become part of the central repository.

When you commit with a DVCS, it updates your local revision history only. This generally confuses users of things like CVS, SVN, VSS, Perforce etc because they check stuff in and nobody else can see it.

With a DVCS, any master/central repository is purely by user nomination. For example, Linus Torvalds’ “official Linux Kernel” repository is the official repository purely because people agree it is.

The core concept behind both Git and Mercurial, then, is that you are primarily working locally and possibly, periodically, exchanging change sets with other users.

With either system, a development team like ourselves can appoint a central repository as the master, to which we push completed change sets from our local machines, and from which build tools pull source for release bundles.
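The commit-locally, push-later idea can be sketched with a toy model in Python. This is not how either tool is implemented; the Repo class and its methods are hypothetical names of mine, and the model ignores merging entirely.

```python
class Repo:
    """Toy repository: nothing but an ordered list of commit messages."""

    def __init__(self, name):
        self.name = name
        self.history = []

    def commit(self, message):
        # A DVCS commit only touches this repository's own history.
        self.history.append(message)

    def clone(self, name):
        # A clone starts out with a full copy of the history.
        copy = Repo(name)
        copy.history = list(self.history)
        return copy

    def push(self, other):
        # Simplification: only works if `other` has nothing we lack.
        if self.history[:len(other.history)] != other.history:
            raise ValueError("histories diverged; a merge would be needed")
        missing = self.history[len(other.history):]
        other.history.extend(missing)
        return missing


master = Repo("master")          # the repository we agree to call central
master.commit("initial import")

mine = master.clone("my-laptop")
mine.commit("fix login")
mine.commit("add tests")

# Nobody else can see my two commits yet.
print(master.history)

# Only an explicit push publishes them.
mine.push(master)
print(master.history)
```

The point of the sketch is the gap between the two print calls: with SVN there is no such gap, because the commit itself is the publish step.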

Both systems let you branch by cloning your current working copy/repository, with the original becoming the clone’s parent. You can then hack away at that branch, get the work completed, and then “push” the changes back up to your parent repository or the master.

They both also use the concept of uniquely identified “changesets”, which appears to just be a fancy term for “a patch including revision history”.
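As a rough illustration of what “uniquely identified” means here: both tools name changesets with SHA-1 style hashes. The sketch below is my own simplification in Python, not the real storage format of either tool; the function name, the field layout and the author name are all made up.

```python
import hashlib

# Stand-in parent id for the very first changeset (an assumption of this sketch).
ROOT = "0" * 40


def changeset_id(parent_id, author, message, patch):
    """Hash the parent id together with the metadata and the patch text."""
    payload = "\n".join([parent_id, author, message, patch]).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


first = changeset_id(ROOT, "alice", "initial import", "+hello\n")
second = changeset_id(first, "alice", "say goodbye", "-hello\n+goodbye\n")

# The same patch applied on top of a different parent gets a different id,
# which is how the id carries the revision history along with the patch.
elsewhere = changeset_id(ROOT, "alice", "say goodbye", "-hello\n+goodbye\n")

print(first)
print(second)
print(second == elsewhere)
```

Because the parent id is part of what gets hashed, two people who make the same edit on different histories end up with different changesets, and identical histories always agree on their ids without any central server handing out revision numbers.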

Git vs Mercurial

Both Git and Mercurial are well supported, popular and powerful. There are Tortoise Windows interfaces for both (TortoiseGit and TortoiseHg).

Git takes a modular approach; a real chip off the ol’ Unix block. Behind the “git” command front-end, it is built on over 100 small modules: git-checkout, git-pull, git-merge, etc… This allows it to be used in wildly creative ways with all the power and flexibility you’ve gotten used to through Unix shell scripting.

On the down side, git is built around over 100 small modules replete with command line switches and arguments that often have to be used in wildly creative ways to achieve all the simple and mundane tasks you’ve gotten used to doing with a right click through GUI interfaces.

Git’s documentation royally sucks. It blows. Unfortunately git is in a state of ongoing hyperactive development that renders wiki pages and blog annotations incorrect before they’re posted. (Edit: due credit, git commands tend to provide verbose feedback when they encounter anomalous behavior or unexpected input – see comments)

We only have one experienced Git user: ahwulf, and his experience wasn’t good.

Git’s commands are almost deliberately anti-CVS styled, making them strange and confusing for old Unix hackers familiar with SVN/CVS/RCS.

For example: The unwary CVS migratee attempting to get rid of changes he does not want might try to ‘revert’ a local copy only to find that it checks something in. In an effort to protect you from possible stupidity, git records the changes and their negation, so that you could undo this action and get back what you were doing. If you absolutely, positively want to be rid of someone typing “penis” into your loading screen code because you left your screen unlocked for 30 seconds, ‘git reset --hard HEAD^’ is what you want.
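The difference between the two is easier to see with a toy model of the history. This is plain Python, not git; the function names and commit messages are mine, and a “commit” here is just a string.

```python
def revert(history, index=-1):
    """Model of a revert: the old commit stays, a negating commit is added."""
    return history + ["Revert: " + history[index]]


def reset_hard(history, count=1):
    """Model of a hard reset: the last `count` commits simply disappear."""
    if count > len(history):
        raise ValueError("cannot drop more commits than exist")
    return history[:len(history) - count]


history = ["add loading screen", "vandalise loading screen"]

# Revert records the change *and* its negation: history grows.
print(revert(history))

# Reset throws the offending commit away: history shrinks.
print(reset_hard(history))
```

So a revert leaves you with three entries where a CVS user expected to get back to one, while the hard reset is the operation that actually makes the prank vanish.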

Mercurial takes ye-olde monolithic app approach, with most of the functionality built into the main executable, with a few small extras and a plugins system. It’s also written in Python rather than C/C++, so it’s a wee little bit slower. The commands are more CVS/Subversion migration friendly, and there are some pretty nice tools for migrating from SVN -> Mercurial. Heck, the TracOnUbuntu Wiki page suggests Mercurial as an alternative, and Trac is basically a subversion front-end :)

There is a tool called “hgsvn” (‘Hg’ being the symbol for the element Mercury) which lets you use Mercurial to manage your local revisions to an svn checkout, import revisions from the repository and commit your Mercurial changesets back to the subversion repository.

Mercurial also has a built-in “convert” feature.

Ultimately… Is Mercurial better?

Alas, that is something of a religious question.

Mercurial is simpler, easier to migrate to, and could be described as “more manageable” depending on your degree of laziness.

Git is more comprehensive and vastly “more manageable” if you are enough of a power user/shell hacker. It can fill an awful lot more niches due to its shell-like flexibility. It is in extremely widespread usage, although I would question what percentage of “users” are actually working with it as opposed to copying and pasting “git clone” commands from github links.

In summary:

Git: The dog’s bollocks, and if you’re working on a very active open source project with a large community of contributors, Git is probably the way to go. GitHub is unarguably one of the finest ways for package/bundle distribution, even more awesomely so for LAMP/Rails platforms, Ruby Gems, Python/Perl modules, etc, etc.

Edit: Git should probably be your choice if you are going to establish an editorialized workflow (peer-review advance, patch committee, dictatorial patch selection).

Mercurial: The migrant’s friend.

My take was that Git is too distributed, or at least too piecemeal. Mercurial offers a path of less resistance for a migration: it is well supported by all the tools we use that currently tie to Subversion, it is better documented, and it has less management overhead.

See also: Git vs. Mercurial.

5 Comments

Yep, exactly so – we use svn lots (mostly configuration and artifact management rather than source code right now) and it’s annoying to branch/merge.

Downside with git and mercurial is that binaries are stored as is without binary diffs (ok, so it’s just disk space but still)

I’m still on the fence as to which one to use, but mercurial is winning for me with the syntax – and being able to run both concurrently makes migrations so very much simpler.

I used Accurev at my last company, and the ease of check-ins subconsciously pulled me into checking in every minor change that I made. Dead simple to back out. Now I’m using an ancient source control system and the difference in check-in rate is dramatic.

Thank you KFS1! Great article!

osmith@ubuntu:~/bin$ git commit
# On branch master
# Changed but not updated:
# (use "git add …" to update what will be committed)
# (use "git checkout -- …" to discard changes in working directory)
#
# modified: buildpatch
# modified: make-host-readme.pl
# modified: punt
#
# Untracked files:
# (use "git add …" to include in what will be committed)
#
# buildpatch.loc
no changes added to commit (use "git add" and/or "git commit -a")

(or git commit .)

Ok – minor deviation from what you’d expect. But Mercurial defaults to “commit” checking in everything in the repository and not, by default, the current directory unless you do ‘hg commit .’.

Unfortunately, when I “push” this to the master repository, I have a bugger of a time remembering how to move the changes from the “changeset” to the visible working-copy of the repository. Grr.

Added the following note on Git:

Git should probably be your choice if you are going to establish an editorialized workflow (peer-review advance, patch committee, dictatorial patch selection).

Git works superbly for the Linux Kernel Team, where Linus is the master of all changes, but has delegated modules to sergeants who supervise incoming changes to their specific domain.

Mercurial emphasizes co-operative development a little more, although you can effectively achieve the same workflow process by other means (such as file system access control preventing other people from checking stuff in to the location of your parent repository).

Mercurial also has strong support for ‘quilting’, or mutable changesets. To quote the wiki entry:

Mq for the impatient

It’s one of those things that sounds a lot harder than it is. It’s basically just orthogonal, mutable changesets. The mutability is exactly what it sounds like you’re looking for. You can keep revising a changeset until it’s good enough, then transfer control to regular Mercurial.
