Josh Susser's blog
A recent addition to Jeff Dean's auto_tagger is the ability to use an alternate tag ref type instead of standard git tags. I pestered Jeff to add this feature after Scott Chacon explained to me how bad it is to create a large number of tags in git. Thanks, Jeff, for the feature addition!
There are a couple of problems with generating autotags as standard git tags. One is that it pollutes the tag namespace, which makes it harder to find tags for releases and the like, and it defeats the tags menu in the GitHub UI. The other is that git automatically syncs tags on every fetch and push, which can noticeably slow things down when you have a lot of tags. It also looks like running GitX with thousands of tags can make the app seriously slow and prone to crashing.
A little background: A git tag is just a kind of ref in a special namespace. A ref is a file that contains a SHA-1 hash identifying a commit, and is an entry point into the big network of objects in the git object database. It's quite easy to create new kinds of refs; you just put a new directory in the .git/refs dir and go from there. auto_tagger will now do this for you simply by adding one configuration option.
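To make the idea of a custom ref namespace concrete, here is a minimal sketch done by hand in a throwaway repository. It only illustrates that a ref outside refs/tags is a real ref that stays out of the tag list; it is not a description of what auto_tagger does internally, and the ref name is just an example.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# Create a ref outside refs/tags. The name is illustrative only.
git update-ref refs/autotags/ci/20101009005621 HEAD

# The new ref resolves to the same commit as HEAD...
git rev-parse refs/autotags/ci/20101009005621

# ...but the ordinary tag list stays empty.
git tag --list

# Refs in the custom namespace can still be listed explicitly.
git for-each-ref refs/autotags
```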
Drop a .auto_tagger options file into your project with these contents:
--ref-path=autotags
auto_tagger will automatically fetch and push tags in a custom namespace when it needs to. You almost never need to look at autotag refs in development, but if you do, you may need to fetch them manually. That's part of the point of using an alternate tag type: avoiding syncing them automatically on every fetch. To manually fetch all the autotags (when using the ref-path=autotags option as above), run this, with the refspec quoted so the shell doesn't try to expand the asterisks:
$ git fetch origin 'refs/autotags/*:refs/autotags/*'
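If you want to see the difference for yourself, the following sketch sets up two throwaway local repositories. It assumes nothing about auto_tagger; the "origin" repository and the ref name are made up, and the ref is created by hand.

```shell
# Throwaway "origin" with one commit and one autotag-style ref.
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
git -C "$work/origin" update-ref refs/autotags/ci/2010-10-09-00-56-21 HEAD

# A plain clone does not bring the custom refs along.
git clone -q "$work/origin" "$work/clone"
git -C "$work/clone" for-each-ref refs/autotags

# An explicit refspec fetches them. The quotes keep the shell from
# expanding the asterisks before git sees them.
git -C "$work/clone" fetch -q origin 'refs/autotags/*:refs/autotags/*'
git -C "$work/clone" for-each-ref --format='%(refname)' refs/autotags
```

The first for-each-ref prints nothing, and the second lists the ref once it has been fetched.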
I also really like the auto_tagger option to format tag names so they are human readable timestamps by adding a separator character. To make that work, set the date-separator option. Your .auto_tagger options file should look like:
--ref-path=autotags
--date-separator=-
Then your autotags look like "ci/2010-10-09-00-56-21" instead of "ci/20101009005621".
When you're moving a codebase from Subversion to git, here are a few things that make the move go a bit more smoothly.
In the svn project, you can discover some things you'll need to adjust in git after the import.
Show all files being ignored
svn propget -R svn:ignore .
Add these files to the .gitignore file at your project root, or in appropriate subdirectories. I prefer keeping it all in one place at the top level.
Show all externals
svn propget -R svn:externals .
You'll either have to switch to using a submodule in git, or just pull the files into your project if that's not possible for some reason.
Find all empty directories
find . -type d -empty
touch /path/to/empty/dir/.gitkeep
Since git doesn't keep empty directories, you can add a .gitkeep file to empty directories that you don't want to go away. Some people add a .gitignore file to keep the directory around, but that sounds totally backwards to me. You want to keep it, not ignore it.
By the way, if you are already ignoring dir/*, that will ignore the .gitkeep file as well. Make sure it isn't missed by adding !.gitkeep to the end of your .gitignore file.
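Here is a small sketch of both steps together in a throwaway project: dropping a .gitkeep into every empty directory, and re-including .gitkeep in a directory whose contents are otherwise ignored. The directory names are invented for the example, and the find command skips git's own metadata under .git.

```shell
# Throwaway project layout; directory names are made up for illustration.
proj=$(mktemp -d)
cd "$proj"
git init -q .
mkdir -p log tmp

# Ignore everything under tmp/, then re-include the placeholder files.
printf '%s\n' 'tmp/*' '!.gitkeep' > .gitignore

# Put a .gitkeep in every empty directory, skipping anything under .git.
find . -type d -empty ! -path './.git/*' \
    -exec sh -c 'touch "$1/.gitkeep"' sh {} \;

# A scratch file in tmp/ stays ignored; the .gitkeep files get staged.
touch tmp/scratch.txt
git add -A
git ls-files
```

Note that the negation only helps for files whose parent directory is not itself ignored. If a pattern excludes a whole directory, git never looks inside it, so a .gitkeep in there cannot be re-included.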
Find all authors
If you want to properly attribute commits, you'll need to set up an authors file. But if you miss any authors, the clone will stop and complain. You can discover all the svn users that you need to put in the authors file with this command:
svn log | grep -E '^r[0-9]+ \| ' | cut -d' ' -f3 | sort | uniq
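Each line of the authors file maps an svn username to a git identity, in the form "username = Full Name <email>". As a starting point, a sketch like the following turns a list of usernames into a skeleton you can then edit by hand. The helper name authors_skeleton and the example.com addresses are placeholders, not anything git-svn provides.

```shell
# Hypothetical helper: read one svn username per line on stdin and print
# an authors-file entry for each. Names and addresses are placeholders
# that still need to be edited by hand.
authors_skeleton() {
  while IFS= read -r user; do
    if [ -n "$user" ]; then
      printf '%s = %s <%s@example.com>\n' "$user" "$user" "$user"
    fi
  done
}

# Sample usernames stand in for the output of the svn log pipeline.
printf '%s\n' alice bob | authors_skeleton
```

In real use you would pipe the username list from the svn log command into the helper, redirect the result to a file, fix up the names and addresses, and hand that file to git svn with its --authors-file option.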
init + fetch > clone
If git svn clone doesn't complete, try doing the init/fetch as separate operations. The clone subcommand is pretty much just doing an init followed by a fetch, but I've found that if the clone fails, doing the commands separately can have better success.
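The two-step version might look like the sketch below. It is wrapped in a function so nothing runs until you call it. The function name svn_to_git is hypothetical, the --stdlayout flag assumes your svn repository uses the conventional trunk/branches/tags layout, and the URL, target directory, and authors file are whatever you pass in.

```shell
# Sketch only: call as  svn_to_git <svn-url> <target-dir> <authors-file>
svn_to_git() {
  url=$1
  dir=$2
  authors=$3

  # Resolve the authors file to an absolute path before changing directory.
  authors_abs=$(cd "$(dirname "$authors")" && pwd)/$(basename "$authors") || return 1

  # Step 1: create the git repository and record the svn layout.
  # No history is downloaded yet.
  git svn init --stdlayout "$url" "$dir" || return 1

  # Step 2: fetch the history. If this stops partway, rerun just the
  # fetch from inside "$dir"; it picks up after the last imported revision.
  (cd "$dir" && git svn fetch --authors-file="$authors_abs")
}
```

The practical benefit of splitting the steps is that a failed fetch leaves an initialized repository behind, so you can fix the problem (a missing author, say) and rerun only the fetch.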
