Sunday, January 23, 2011

Why Rewriting History Is A Good Thing

Part of the whole Git vs. Mercurial debate revolves around the issue of rewriting history.

Git lets you change your code's history after the fact, and it lets you do this very easily. Mercurial, out of the box, doesn't (its history-editing tools are extensions you have to enable), and Mercurial people seem to think this feature isn't very important.

The misconception seems to be that the only reason to rewrite the source code's commit history is to hide errors that were made in the past. In other words: ego. That may be part of it. I can certainly imagine someone contributing code to the Linux kernel and wanting his patch to appear as if it was written by Jesus himself (correct from the start, with no backtracking to fix bugs). But that's certainly not the whole story.

Much more important, in my opinion, is the signal-to-noise ratio of code and code-history.

We treat code with the utmost respect. For example, it is widely considered very poor form to keep commented-out code around in your source files, especially if it's obsolete or incorrect. We don't want the people reading our code to be distracted by irrelevant information.

These days, the SNR aspect of code applies more and more to the code's history as well. People review each other's code and need to know that the changes they're reviewing aren't just typos and botched commits. A tremendous amount of time can be wasted trying to wrap your head around a commit somebody else has made, only to find out later that it's just a bit of nonsensical noise that was committed by accident.

It's also a great advantage to be able to keep certain branches so clean and stable that every single one of their revisions is a rock-solid snapshot that can be checked out and branched from. When you can't keep your history clean, you end up inventing other methods of marking commits as good or bad, and that's just silly. Some branches cannot afford bad commits.

Ultimately, Git's approach lets you treat your code's history with the same respect you treat the code itself. Being able to rewrite history with such ease also encourages freer experimentation and makes you commit more often, because you don't have to fear polluting the shared history with half-baked tinkering. Git lets you alternate between focusing on coding and focusing on SCM. You make your code as good as you can, and then you clean up the history as well as you can, before inflicting it upon the world.
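To make that last step concrete, here is a minimal sketch of cleaning up local history before anyone else sees it. It is only one way to do it: the repository, file name, identity and commit messages are all made up for illustration, and `git reset --soft` stands in for the interactive `git rebase -i` you would more likely use day to day, simply because it runs without an editor.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# A stable starting point.
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial version"

# Three noisy local commits: the real change plus two fix-ups.
echo "feature" >> app.txt
git commit -q -am "Add feature"
echo "typo fix" >> app.txt
git commit -q -am "oops, typo"
echo "another fix" >> app.txt
git commit -q -am "really fix it this time"

before=$(git rev-list --count HEAD)

# Move the branch back three commits while keeping the final content
# staged, then record that content as a single commit.
git reset -q --soft HEAD~3
git commit -q -m "Add feature"

after=$(git rev-list --count HEAD)
echo "commits before: $before, after: $after"
```

The file ends up with exactly the same content either way; only the story told by the log changes. This kind of cleanup is meant for commits that have not been pushed yet, since rewriting history that others have already pulled causes trouble for them.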

1 comment:

glevr said...

Right. This also helps to enforce some coding conventions. For instance, suppose a new developer has committed a change that bundles together a code cleanup, a few useful comments, a bug fix and a refactoring. We clearly do not want that saved as one commit. Without the ability to alter the history, the only thing we can do is yell at the developer to never, ever do it again. With this feature, however, we can divide the commit into a few separate commits.
