Puppet: System Administration Automated

Distributed Version Control


The question of distributed version control came up on the mailing list again when I said I wanted to switch Puppet's main development to one of the available tools, and Jason Kohles was concerned that it adds too much complexity without adding much functionality.

Because this comes up often, I figured it made more sense to turn my response into a blog post rather than a fleeting list message.

There are two primary problems I hope to solve by switching to a distributed SCM:

Pre-Branching

Probably the most important problem that a distributed SCM would solve in Puppet's current development environment is branching: it automatically pre-branches on every checkout, so you can just start developing and not have to worry about whether the work you're doing deserves a branch or not. I do a lot of work that should be committed in multiple steps but can't be, because it would leave trunk in an inconsistent state. Yes, I should create a branch in that case, but I often don't realize that a branch is needed until I'm half done with the development, and Subversion just isn't set up to deal well with mid-development branching.

With a distributed SCM, every checkout is a branch, so I can make as many commits as I want and merge them in the end. This is great because it means that branching doesn't need to be planned out.

Lower Barrier of Development

Distributed SCMs also really encourage more development. With a centralized system, even if you have commit access you're going to tend not to experiment much because you will always be concerned about leaving the system in an unstable state. Of course, you could always make a branch, but if you just want to experiment it often doesn't seem worth it. Or, even worse, you'll begin an experiment, find out halfway through that it's worthwhile, and then not be in a position to make a branch because you've done so many moves and copies.

Puppet development shouldn't require my permission just because the work spans more than one commit. If you think you can do great development, then you should be able to do it and just send me a diff. Even if you don't want to contribute the source back, or just not yet, it should be easy to develop over multiple commits.

I've even had this problem with people who have commit access -- they're working on a project that's big enough to span multiple commits but they forgot to branch initially or underestimated the amount of work involved. As a result, they are unwilling to commit until all of the work is done, which means not only that they're susceptible to data loss (one collaborator almost lost all of his work because of a hard drive failure) but that I can't easily get a mid-stream idea of what they're doing.

Other Reasons

Even if there weren't any other developers, I think I'd still want a distributed SCM. I seem to fly a lot these days (so much that American Airlines put me in their Gold club or whatever it is), and I find that I can usually get a lot of development work done on the plane. However, more than once I've been doing work that should span multiple commits, or I've managed to destroy my checkout by doing things that Subversion can't revert from a working copy, like moving directories around. Every time I'm on a plane, I really want a dSCM.

Further Reading

Mark Shuttleworth has had a few posts on this topic recently, and they're worth looking through, especially the post on merging.

Conclusion

For me, the real question is: what is the unit of development? Is it a commit, or does it often span multiple commits? I think it often spans multiple commits, so we need a tool that makes that easy.

Jason's original concern was over ease of use, but for trivial cases dSCMs are as easy to use as centralized tools -- you just check out, make your changes, and email a diff. The big benefit of a dSCM is that it leaves much more room for complexity, whether or not you end up needing it.


Mon, 09 Jul 2007