multiple remote branches in git-svn (the right way)


There are several pages out there that explain how to use git-svn to track multiple remote SVN branches without importing every branch in your SVN repository. I need this for MPICH2, where I have two problems:

  1. We have a non-standard repository layout. Normal SVN repos have $ROOT_URL/{trunk,branches,tags}, with branch and tag names under the branches and tags directories respectively. Instead, we have $ROOT_URL/{trunk,branches/dev,branches/release,tags/dev,tags/release}. I believe this was done because we had trouble with dev stuff cluttering up the branch and tag namespaces back when we were still in CVS. This isn’t a dealbreaker for git-svn, but it’s the first stumbling block when you are getting started with it. If anyone needs help with this part, let me know and I can post what I use for this here.
  2. Most of the tags and branches are full of junk that I don’t care about. I really want all of the release branches and tags plus select dev branches and tags. Importing all this extra history takes forever and in some cases seems to confuse git-svn when it tries to determine the parent-child relationship for some of the branch points. This is the real non-starter for me. Also, while MPICH2 isn’t a small project, it isn’t a huge one either. There are fewer than 5000 changesets in the repo as of this writing, and a fresh SVN working copy takes up about 100 MiB on disk. I can’t imagine importing either all branches or no branches of a monster like KDE, for example.
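
For the first problem, git-svn accepts more than one branches and tags line in an svn-remote section, so a layout like this one can be described directly. The following is only a minimal sketch under that assumption; the dev/ and release/ prefixes on the local ref names are hypothetical choices made for illustration. Note that it pulls in every branch and tag, which is exactly the second problem.

```ini
[svn-remote "svn"]
       url = https://svn.mcs.anl.gov/repos/mpi
       fetch = mpich2/trunk:refs/remotes/trunk
       # The ref prefixes on the right-hand side are illustrative assumptions.
       branches = mpich2/branches/dev/*:refs/remotes/dev/*
       branches = mpich2/branches/release/*:refs/remotes/release/*
       tags = mpich2/tags/dev/*:refs/remotes/tags/dev/*
       tags = mpich2/tags/release/*:refs/remotes/tags/release/*
```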

So, what I needed was a way to fetch specific branches from my SVN repository in a fairly straightforward manner. If you google around some, you’ll generally find instructions that set up a separate svn-remote section for each branch. I’ve used that strategy and it works, but it requires you to explicitly fetch/rebase from the different svn-remote sections, which I don’t like. My approach is to add multiple “fetch” lines to a single svn-remote section of your .git/config file. So if you start with this:

[svn-remote "svn"]
       url = https://svn.mcs.anl.gov/repos/mpi
       fetch = mpich2/trunk:refs/remotes/trunk

Add additional fetch lines like so:

[svn-remote "svn"]
       url = https://svn.mcs.anl.gov/repos/mpi
       fetch = mpich2/trunk:refs/remotes/trunk
       fetch = mpich2/branches/dev/threads:refs/remotes/threads
       fetch = mpich2/branches/dev/knem:refs/remotes/knem
       fetch = mpich2/branches/dev/pmi2:refs/remotes/pmi2
       fetch = mpich2/branches/release/mpich2-1.0:refs/remotes/mpich2-1.0
       fetch = mpich2/branches/release/MPICH2_1_0_8:refs/remotes/mpich2-1.0.8
       fetch = mpich2/branches/release/MPICH2_1_0_7:refs/remotes/mpich2-1.0.7
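
If you would rather not edit the file by hand, the same lines can be appended with git config. This is a minimal sketch: add_svn_fetch is a hypothetical helper name, not part of git-svn, and it assumes the remote section is called "svn" as above.

```shell
# Hypothetical helper, not part of git-svn: append one fetch line to the
# [svn-remote "svn"] section of the given config file.
# Usage: add_svn_fetch <config-file> <svn-path> <remote-ref-name>
add_svn_fetch() {
    git config --file "$1" --add svn-remote.svn.fetch "$2:refs/remotes/$3"
}

# Usage, from the top of the git repository:
#   add_svn_fetch .git/config mpich2/branches/dev/threads threads
#   add_svn_fetch .git/config mpich2/branches/release/MPICH2_1_0_8 mpich2-1.0.8
```

The --add flag matters here: without it, git config would replace an existing fetch value instead of adding another one.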

Then you could just “git svn fetch”. However, at least for MPICH2, that seems to take a really, really long time to complete. So I usually cheat and look up the oldest commit on the branch and start there:

% svn log --stop-on-copy \
   https://svn.mcs.anl.gov/repos/mpi/mpich2/branches/release/mpich2-1.0 | \
   grep -E '^r[0-9]+.*lines?$' | tail -n 1
r1037 | goodell | 2008-07-08 13:04:48 -0500 (Tue, 08 Jul 2008) | 3 lines
% git svn fetch -r 1037
... some git-svn output indicating that history is being stored locally...
% git branch -r
 knem
 mpich2-1.0
 mpich2-1.0.7
 mpich2-1.0.8
 pmi2
 threads
 trunk
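
The lookup above can be wrapped up so that the revision number does not have to be copied by hand. This is a rough sketch: oldest_rev is a hypothetical helper name, and BRANCH_URL in the usage comment is a placeholder for whichever branch URL you are interested in.

```shell
# Hypothetical helper: read `svn log --stop-on-copy <branch-url>` output on
# stdin and print the number of the oldest revision listed.
oldest_rev() {
    # Header lines look like "r1037 | goodell | 2008-07-08 ... | 3 lines".
    # svn log prints newest first, so the last header is the oldest revision.
    grep -E '^r[0-9]+ \|' | tail -n 1 | sed -E 's/^r([0-9]+) .*/\1/'
}

# Usage (not run here; needs access to the SVN server):
#   rev=$(svn log --stop-on-copy "$BRANCH_URL" | oldest_rev)
#   git svn fetch -r "$rev"
```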

If you run into trouble and your git-svn metadata doesn’t seem right, it seems to pretty much always be safe to blow away the metadata in the .git/svn directory and re-fetch. The re-fetch ought to go very quickly because the actual diffs are already stored in the git part of your repository; only the correspondence between git and svn information needs to be re-established.

One thing that I have to admit I haven’t tested is whether I can git svn dcommit back to one of these remote SVN branches from derived local dev branches, or whether it will try to apply the commits to the trunk instead. In practice this isn’t a big issue for me, since I don’t use branches on the SVN server very much now that I have private git dev branches, but I assume it will still come up a few times per year. If I can’t dcommit back, I will need to go back to separate svn-remote sections for the branches to which I need to commit.
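
For the metadata reset, a slightly more cautious variant is to rename the directory instead of deleting it outright. This is a minimal sketch: reset_svn_metadata is a hypothetical helper name, and the svn.old.<pid> backup name is just an illustrative choice.

```shell
# Hypothetical helper: set aside the git-svn metadata of the repository at $1
# (defaults to the current directory) so that the next fetch rebuilds it.
# It renames .git/svn rather than deleting it, so nothing is lost for good.
reset_svn_metadata() {
    rsm_repo="${1:-.}"
    if [ -d "$rsm_repo/.git/svn" ]; then
        mv "$rsm_repo/.git/svn" "$rsm_repo/.git/svn.old.$$"
    fi
}

# Usage, from the top of the git repository (the fetch is not run here):
#   reset_svn_metadata .
#   git svn fetch
```

As for the dcommit question, running git svn dcommit --dry-run from the local branch should be a harmless way to find out, since the dry run lists what would be committed without sending anything to the server.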