Monday, June 13, 2011

Subversion branching strategies

For the longest time I've been fighting a battle I cannot win with people who use Subversion's branching mechanism improperly. In this installment I'll try to lay out some of the "proper" ways and explain why doing it any other way causes issues that are far more troublesome than the alternative.

Usually when you branch without any version control system, what you end up with is a separate folder with your sources containing a sort of "soft-frozen" version of your project. The "soft-" part comes from the fact that you actually can (and probably will) modify those sources. In this case what you're creating is a so-called release branch. At the end of the development cycle you usually create a ZIP archive, give it a solid name (like my-cool-app-1.0-20110613.zip) and burn it on a DVD or something else that's meant to last forever. This is a "tag", meaning a "hard-frozen" version of your project.

There's a second case where you might want to create a copy of your project (read: branch it), and it comes up quite often. Imagine you're about to do a spike and don't want your regular development to go to pieces. What you end up with is something called a "feature branch".

Now let's examine how those branching strategies are implemented in Subversion.

Branches in Subversion are copies of some other location. Same thing for tags. End of story :)

A release branch is usually created to make sure you include only those features and fixes your clients should have in a particular version. In this case it makes perfect sense to name your branch after the partial version number, for example 1.0.x, where 1.0 is the actual major.minor version and the .x part is the release that hasn't happened yet. When you do release, you substitute the last part with an incremental number, by convention starting with 0 and incremented by 1, for example 1.0.0, 1.0.1 and so on.
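To make the naming convention concrete, here is a minimal sketch in Python. The function name next_release_tag and the idea of passing in a plain list of existing tag names are my own assumptions for illustration; Subversion itself knows nothing about this convention.

```python
import re

def next_release_tag(branch_name, existing_tags):
    """Return the next tag name for a release branch such as '1.0.x'.

    Hypothetical helper: the first release becomes '<prefix>0', and every
    later one is the highest existing number plus 1.
    """
    if not branch_name.endswith(".x"):
        raise ValueError("expected a branch name like '1.0.x'")
    prefix = branch_name[:-1]  # '1.0.x' -> '1.0.'
    pattern = re.compile(re.escape(prefix) + r"(\d+)")
    released = []
    for tag in existing_tags:
        match = pattern.fullmatch(tag)
        if match:  # ignore tags that belong to other release branches
            released.append(int(match.group(1)))
    next_number = max(released) + 1 if released else 0
    return prefix + str(next_number)

print(next_release_tag("1.0.x", []))                           # 1.0.0
print(next_release_tag("1.0.x", ["1.0.0", "1.0.1", "1.1.0"]))  # 1.0.2
```

Tags from other branches (1.1.0 above) are ignored, so each release branch keeps its own counter.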

While your main development is being done on trunk, only the things that should go into the next release are merged from trunk to the release branch. So the direction in which changes travel is obvious: trunk -> release branch, and when you release they end up in a tag.
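The one-way flow from trunk to a release branch can be pictured with a toy model in Python. It only tracks revision numbers; the function names and revision numbers are made up for illustration, and this is not how Subversion stores anything internally.

```python
def eligible_revisions(trunk_revisions, merged_into_release):
    """Trunk revisions not yet merged into the release branch, oldest first."""
    return sorted(set(trunk_revisions) - set(merged_into_release))

def merge_selected(selected, trunk_revisions, merged_into_release):
    """Record hand-picked trunk revisions as merged and return the new set."""
    unknown = set(selected) - set(trunk_revisions)
    if unknown:
        raise ValueError("not trunk revisions: %s" % sorted(unknown))
    return set(merged_into_release) | set(selected)

trunk = [101, 102, 103, 104]                   # hypothetical trunk commits
release = merge_selected([102, 104], trunk, set())
print(sorted(release))                         # [102, 104]
print(eligible_revisions(trunk, release))      # [101, 103]
```

Nothing ever travels from the release branch back to trunk in this model, which is exactly the point.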

With feature branches the case is different. You do the work you're supposed to do on the branch, be it a fix for something or a cool new feature for your project. Before finishing, you integrate the latest changes from trunk into your feature branch, making it "compatible" with the rest of the system (you can, and should, do that periodically). Once you're done, your feature branch is the equivalent of trunk + the feature you were implementing. At this stage you should "reintegrate" your feature branch with trunk and close the feature branch.

The good news about what I described above is that Subversion helps you out every step of the way by keeping so-called merge information (the svn:mergeinfo property) and not letting you merge the same commits twice by mistake. So when you're integrating changes from trunk into your feature branch you can run the same command over and over again and nothing will go wrong. The key thing to remember is that when bringing changes from the feature branch back into trunk you must not do the classical merge but use a feature called "reintegrate" instead. Rather than replaying individual revisions, it takes the diff between the latest trunk and the latest branch and applies it to trunk.
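Here is a toy model of that behaviour in Python, just to show why repeating the sync is harmless and why reintegrating boils down to a diff. It treats each line of development as a set of change ids; the Line class, the function names and the ids are invented for this sketch, and the refusal to reintegrate an unsynced branch is a simplifying assumption that mirrors the rule above rather than Subversion's actual checks.

```python
class Line:
    """A line of development, modeled as the set of change ids it contains."""
    def __init__(self, name, changes=()):
        self.name = name
        self.changes = set(changes)

def sync(branch, trunk):
    """Merge trunk changes the branch lacks; return what was newly applied."""
    new = trunk.changes - branch.changes
    branch.changes |= new
    return new

def reintegrate(branch, trunk):
    """Apply the branch-vs-trunk difference to trunk.

    Assumption of this sketch: the branch must already contain all of trunk.
    """
    missing = trunk.changes - branch.changes
    if missing:
        raise RuntimeError("sync the branch with trunk first")
    feature_only = branch.changes - trunk.changes
    trunk.changes |= feature_only
    return feature_only

trunk = Line("trunk", {"r1", "r2"})
feature = Line("feature", {"r1", "f1"})
print(sorted(sync(feature, trunk)))         # ['r2']
print(sorted(sync(feature, trunk)))         # [] (nothing left to merge)
print(sorted(reintegrate(feature, trunk)))  # ['f1']
```

After the last call trunk holds r1, r2 and f1, which is the "trunk + the feature" state described earlier.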

The same goes for release branches, but here you'll most probably merge only selected changes to the release branch. That makes the merging process extremely easy, right up until your release branch has to carry release-specific modifications just so that new patches can be integrated, at which point it becomes painful. That is the first sign you should finally close the branch and create a new one!

So what happens when you do the wrong thing? And what is the wrong thing you can do when branching in Subversion anyway?

Imagine you ran the project on trunk for the most part. Then at some point you created a branch and from that moment on you stopped development on trunk altogether, probably because you want your trunk to be stable at all times. This is a silly and immature point of view, knowing that most of the time the only branch that's going to be CI-tested is trunk, so 99% of the job you're doing is unchecked. That leads to cranky developers, hard feelings, bad language... You get the idea :/

The worst part of it is that there might be more than one part of your team working on more than one so called branch at the same so called time. This is where things start to get reaaaally interesting: people try to merge months of work with trunk. That works only for the first team, because they had the smallest difference when they started. The second team is way off course and spends long days, possibly even weeks, trying to figure out what the other team did just to make the application work again. And in the meantime, because the first team was so damn productive, they introduce yet another refactoring to the mix and the others are left with their pants down. Subversion does not properly understand what came from where, strange merge errors occur, heads are rolling, hard feelings, bad language, the usual...

Another thing, which is not so much about Subversion as a system but is simply a bad practice, is not having CI on trunk or not having CI at all (wrrrrrrrrr... that gives me goosebumps)! Man, you just don't think about it: you just do it, and you get over the fact that the CI server tells you the build has been broken. Get over it! This kind of policeman in your project is the good kind of cop.

So to summarize:

1. Do release branches, merge selected revisions from trunk to the release branch as needed, and freeze tags from them.
2. Do feature branches whenever you're doing anything that requires more than an hour of work. Before reintegrating to trunk, bring your branch up to date with the latest trunk, then reintegrate. After that there's no going back to the branch, and you should keep that as the hardest rule of all.
3. When your CI server tells you that you've fu..ed up the compilation process or the tests, nothing in the whole world is more important than fixing the build problem, because there might be tons of people depending on you doing a good job and not messing up their lives.
4. Never ever allow your project to come to a point where 2 huge branches need to be integrated at once. This causes problems you can't possibly foresee early. And you will pay for it dearly!
5. This is the coolest one of all: Don't create a branch of a branch of a branch. Just don't do it. You'll never be able to integrate the changes made on those things (whatever you want to call them)!

So my friends, that's it. 5 simple rules to follow, right? Actually no. There's nothing more powerful than your experience and your understanding of what you're trying to accomplish. If you're ok with doing what I so harshly criticized in point 4, then go ahead! Maybe doing feature branches isn't all that necessary? Maybe you'll prefer frequently failing builds instead and make it part of your working cycle to deeply care about what Hudson or CruiseControl have to say?

However, I am confident that if you follow those 5 simple rules and give Subversion a chance to help you out, you'll be in a much more pleasant situation than those gurus out there who keep saying "Subversion has a poor merging mechanism" or worse.

Happy branching!
