I seem to have version control systems on the brain somewhat these days. Until not very long ago, I didn’t think much of them: they’re obviously important to have around if you’re collaborating on software, but other than that, who cares? But then I started getting addicted to looking at diffs after having checked in code, and thought that, perhaps, they might be important if you’re working on software even if you aren’t collaborating. And then I started getting more involved in the administration of the server hosting this blog, and began to appreciate that they’re useful for storing other kinds of information: in particular, we store all the configuration information in Subversion repositories.

And then Jim Blandy pointed me at the figure in this discussion of ZFS and this figure explaining Subversion’s “bubble-up method”. (Hey Jim, start updating your blog and I’ll actually link to it! :-) ) And I started using Time Machine; as John Cowan reminded me, I’m using it in ways that are seriously lacking from the “catastrophic recovery” viewpoint of backup (if my house burns down, I lose that backup, too), but of course backups are useful for other reasons, namely to save you if you make a stupid mistake and accidentally delete something you wish you hadn’t.

So now I’m starting to wonder: are we moving to a world where VCS-like functionality will be considered a part of basic filesystem functionality? At the very least, every time I create a new directory, I now ask myself if I should place it under revision control; frequently the answer is “yes”. For example, I’ve now placed my GTD files under version control; this seems like a particularly good fit, because GTD is all about removing subconscious worries, and now I don’t have to worry that, as I check off steps in a project, I’m deleting information that I might want to refer to later! (I’m using Mercurial for that repository, initially for a change of pace and because it’s one of the cool new kids on the block, but after thinking about it more, I can easily imagine wanting to clone that repository to this laptop and make commits while I’m off the net.)

There are some interesting design decisions here. On a basic level: do I want to back up everything (with a revision history), or do I want some files/directories to be excluded? (Browser caches are one candidate for exclusion.) And do I want to be able to rewrite history? In particular, do I want to have the ability to remove confidential information from all backups?

How intentional should the creation of new revisions be? I can imagine a filesystem that stores every single change as a separate revision, or one that takes revisions periodically (as Time Machine does). But, in a VCS, revisions generally serve some sort of communication purpose: if I’m working on source code, I don’t want every save of a file to be reflected in the revision history; I want control over when I commit. It’s not so clear in other instances; I’m thinking of setting up a cron job to do a commit on my GTD info every night, and reserving manual commits for special occasions.
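
For what it’s worth, here is a minimal sketch of what that nightly job might look like, assuming Mercurial’s `hg` is on the path and the GTD files live in a repository at `~/gtd`. That path, the commit message format, and the function names are all made up for illustration. The script checks `hg status` and only commits when something has actually changed, so quiet days don’t produce failed or empty commits.

```python
import datetime
import subprocess
from pathlib import Path

# Hypothetical location of the GTD repository; adjust to wherever yours lives.
GTD_REPO = Path.home() / "gtd"


def nightly_message(today):
    """Build a commit message that marks the revision as automatic."""
    return f"Automatic nightly commit for {today.isoformat()}"


def commit_command(today):
    """Return the hg invocation; --addremove picks up new and deleted files."""
    return ["hg", "commit", "--addremove", "-m", nightly_message(today)]


def has_changes(repo):
    """True if `hg status` reports any modified, added, removed, or unknown files."""
    result = subprocess.run(
        ["hg", "status"], cwd=repo, capture_output=True, text=True, check=True
    )
    return bool(result.stdout.strip())


def nightly_commit(repo, today):
    """Commit everything in the repository if anything changed; report whether we did."""
    if not has_changes(repo):
        return False
    subprocess.run(commit_command(today), cwd=repo, check=True)
    return True


if __name__ == "__main__":
    # Only act if the assumed path really is a Mercurial repository.
    if (GTD_REPO / ".hg").is_dir():
        nightly_commit(GTD_REPO, datetime.date.today())
```

A crontab entry pointing at wherever the script is saved would run it every night. The fixed “Automatic nightly commit” prefix keeps these revisions easy to tell apart from the manual commits reserved for special occasions.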

I’m also wondering what other lessons filesystems could learn from VCSes. For example, distributed VCSes are all the rage these days; could networked filesystems learn something from that? (Though I can’t quite envision how the merge problem will get solved there.) Are there situations (cloud computing?) where we can get some mileage from relaxing consistency guarantees and viewing different filesystems as repositories with a genetic relationship, with changes periodically pushed/pulled in one direction or another? Maybe we should pay more attention to asynchronous notifications, either in the filesystem world or the VCS world; I spend enough time looking at networking problems through a RESTful lens that I get the feeling that asynchronous notifications are a bit out of style, but I’m appreciating them at work more and more these days. (Hmm, Twitter suggests that they’re not out of style at all, doesn’t it?)

No big conclusions, and I’m sure other people have been aware of some of these issues for decades, but I’m enjoying noticing these things.
