How not to migrate from Subversion to Git

You know how blog posts typically cover what worked, and not all the things people tried that didn’t? This post is dedicated to what didn’t work.

Don’t do this #1 – Migrate from a remote repository

Migrating a large repository from SVN to Git requires a large number of network roundtrips, which slows things down greatly. It’s better to export/dump the repository and run everything locally.

See the main blog post for how to create a local dump/rep

Don’t do this #2 – Split the dump by project

I had the idea to split the full SVN dump file into smaller SVN dump files by project. I chose to preserve revision numbers rather than use “--renumber-revs”, because we used the revision numbers in our release notes. Here’s a sample command:

svndumpfilter include "IntegrationTests" --drop-empty-revs < full.dmp \
  > project_IntegrationTests.dmp

We had one project that made up the majority of the SVN code base (the forum software). All of the tags were for this project. I thought I would import this one as “full.dmp” and just delete the “trunk” projects afterwards. That way I’d only be filtering the smaller/safer ones.

None of this was necessary! You can just point the migration at the same full SVN dump with different paths to migrate projects into their own repositories.

Don’t do this #3 – Check out the entire repository including tags

Migrating using “git svn clone” requires an authors.txt to map SVN users to GitHub names/emails. I had the idea to check out the entire repository including tags and run svn log on it to get the committers. After 90 minutes, I gave up on this idea.

Don’t do this #4 – Assume that all authors/committers are people

There were a couple of commits from Jenkins, which seems reasonable. There were also a couple of commits as “root”, “test” and other random users. Judging by the readme.txt from one of those commits, they look like they came from a command line import.

Don’t do this #5 – Guess at what should be in the authors.txt file

We have about 90 users in our authors.txt file. I thought I would save time by only putting the people I thought were committers in the authors.txt. This was a problem for a few reasons:

  • About 30 people committed to the main project
  • A few people committed who no longer have access to the code base
  • We had some “funky” committers including “root” and “test”

This meant I kept running the “git svn clone” command, having it fail on missing users, adding them to authors.txt and resuming the run (re-running automatically resumes).

It would have been better to use svn log on trunk to get all the authors, or the --authors-prog flag to specify a command that fills in any defaults. This would have let me write “Unknown” for the funky ones and be done with it.
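Here is a minimal sketch of the first option, assuming the output of “svn log -q” on trunk has already been captured as a list of lines. The class name, the method names and the example.invalid email domain are placeholders made up for illustration, not part of the actual migration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class AuthorsFileSketch {

    /** Collects the distinct usernames from lines shaped like "r42 | user | date". */
    static Set<String> extractAuthors(List<String> svnLogLines) {
        Set<String> authors = new TreeSet<>(); // sorted, no duplicates
        for (String line : svnLogLines) {
            String[] fields = line.split("\\|");
            // Revision header lines start with "r<number>"; separator lines have no "|".
            if (fields.length >= 3 && fields[0].trim().matches("r\\d+")) {
                authors.add(fields[1].trim());
            }
        }
        return authors;
    }

    /** Builds one authors.txt entry; the name and email are placeholders to edit by hand. */
    static String toAuthorsLine(String svnUser) {
        return svnUser + " = " + svnUser + " <" + svnUser + "@example.invalid>";
    }

    /** Turns captured log lines into a complete list of authors.txt entries. */
    static List<String> toAuthorsFile(List<String> svnLogLines) {
        List<String> result = new ArrayList<>();
        for (String author : extractAuthors(svnLogLines)) {
            result.add(toAuthorsLine(author));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample in the shape "svn log -q" prints.
        List<String> sample = List.of(
            "------------------------------------------------------------------------",
            "r1 | root | 2010-01-01 10:00:00 -0500 (Fri, 01 Jan 2010)",
            "------------------------------------------------------------------------",
            "r2 | jenkins | 2010-01-02 10:00:00 -0500 (Sat, 02 Jan 2010)",
            "------------------------------------------------------------------------",
            "r3 | test | 2010-01-03 10:00:00 -0500 (Sun, 03 Jan 2010)");
        toAuthorsFile(sample).forEach(System.out::println);
    }
}
```

Every committer ends up in the file on the first pass, including the funky ones, so the clone doesn’t stop on a missing user. The real names and emails can then be filled in by hand.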

Don’t do this #6 – Make assumptions about project structure

At the top level, the repository had:

  • about 20 projects (directly at the root level, not under trunk)
  • a branches directory
  • a tags directory

I foolishly assumed that meant that the 20 projects had the code directly inside them. And sometimes that was true. However, for about 5 projects, there was a nested trunk/branches/tags structure under that project.
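A check like this could have caught the difference up front. It is only a sketch: it assumes the entries come from running “svn ls” on each project directory (directories are listed with a trailing slash), and the class, enum and sample file names are made up for illustration.

```java
import java.util.List;

final class LayoutCheckSketch {

    enum Layout { NESTED, FLAT }

    /** Reports whether a project has its own trunk directory or holds code directly. */
    static Layout detect(List<String> svnLsEntries) {
        boolean nested = svnLsEntries.contains("trunk/");
        return nested ? Layout.NESTED : Layout.FLAT;
    }

    public static void main(String[] args) {
        // Hypothetical listings for two projects.
        System.out.println(detect(List.of("trunk/", "branches/", "tags/"))); // NESTED
        System.out.println(detect(List.of("src/", "build.xml")));            // FLAT
    }
}
```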

We all know that thing about standards. There are so many….

Don’t do this #7 – Migrate 300 large tags

This project uses Ant (and not Ivy) so there are a lot of jar files in the repository. This means tags are large. With just under ten thousand commits and just under 400 tags, this proved to be just too much.

Watching “git svn clone” run, I could see it work through commits 1 to n in order. This means the later commits/tags need to go through a large amount of work to make progress. Despite that, progress was surprisingly linear.

After 12 hours, it had migrated 2700 commits and after 26 hours, it was up to commit 5446. At the 18 hour mark, it was up to commit 6926. (At the 24 hour mark, I decided to abandon this approach. I let it run until I needed to shut down my computer to see what would happen.)
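To put a number on “linear”, here is a quick back-of-the-envelope calculation using only the 12-hour and 26-hour measurements. The class and method names are made up for illustration.

```java
final class MigrationRateSketch {

    /** Average commits migrated per hour between two measurements. */
    static double commitsPerHour(int startCommit, double startHour, int endCommit, double endHour) {
        return (endCommit - startCommit) / (endHour - startHour);
    }

    public static void main(String[] args) {
        double firstStretch = commitsPerHour(0, 0, 2700, 12);      // 2700 / 12
        double secondStretch = commitsPerHour(2700, 12, 5446, 26); // 2746 / 14
        System.out.printf("Hours 0-12:  %.0f commits/hour%n", firstStretch);
        System.out.printf("Hours 12-26: %.0f commits/hour%n", secondStretch);
    }
}
```

That works out to roughly 225 commits per hour for the first stretch and about 196 for the second. It is a slowdown, but a small one given how much more history the later commits carry.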

Most of the wasted time was for the tags, which in SVN are a copy. In Git, they are just a label, so this is a lot of unnecessary duplication in a migration.

See another approach for migrating tags

What type is a var?

Java 10 introduced “var” where the type of the variable is implied. This leads to some tricky scenarios.

We first learn that “var” can replace the type. That means these two code blocks are equivalent.

int a = 9;
int b = a;

var a = 9;
int b = a;

Ok. So far so good. Now we have this code:

short a = 9;
short b = a;

So we substitute var and the code no longer compiles!

var a = 9;
short b = a;

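What’s going on? Well, Java only uses the one line to figure out the type. The literal 9 is an int, so variable a is an int. That holds until we get to the next line, where an int variable can’t be assigned to a short without a cast. (The explicit “short a = 9” version works because the constant 9 fits in a short.)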

This would compile, but defeats the purpose of using var. So be careful!

var a = (short) 9;
short b = a;
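One way to see the inferred type for yourself is to let overload resolution report it. This is just a sketch, and the class and method names are made up for illustration.

```java
final class VarInferenceSketch {

    // Overloads let the compiler tell us which static type it inferred.
    static String typeOf(int value) { return "int"; }
    static String typeOf(short value) { return "short"; }

    static String plainLiteral() {
        var a = 9;          // inferred from the int literal alone
        return typeOf(a);
    }

    static String castLiteral() {
        var a = (short) 9;  // the cast makes the initializer a short
        return typeOf(a);
    }

    public static void main(String[] args) {
        System.out.println(plainLiteral()); // int
        System.out.println(castLiteral());  // short
    }
}
```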

Calling from an airplane

I wanted to listen to a phone call while I was in the air. I wasn’t sure if it would work, but worth a shot!

I tried Skype, but it dropped the call after just a few tries due to a poor network connection. While the JetBlue wifi isn’t weak (you can watch video on Amazon Prime), you aren’t supposed to be making phone calls, so maybe they block it.

My second attempt worked. I used wifi calling on my iPhone. It is off by default. What I did:

  • turn on airplane mode (I did this before boarding)
  • connect to the JetBlue wifi
  • go to flyfi.com in a browser and accept the terms of service
  • go back to settings and open the cellular section (you can leave cellular off to do this)
  • go to wifi calling and turn it on
  • accept the two prompts

That’s it. I was able to make a call from my phone. After the call, I turned off wifi calling since I’m not familiar with the impact. And it isn’t as if I am running out of minutes!

Note that you aren’t supposed to make phone calls in the air lest it annoy your neighbors. However, listening to a call is like listening to a podcast. I’m on mute the whole time and my headset doesn’t have a mic anyway.