The server that hosts this blog is one that I administer; it runs Ubuntu, so I’d been in the habit of doing an OS upgrade every 6 months, as Ubuntu releases a new version. I’d do an upgrade in place; most of the time it would work smoothly, sometimes it would be less smooth but I’d be able to figure it out. (For example, the upgrade to Apache 2.4 brought in some pretty significant configuration differences.) But every once in a while something would go horribly wrong to the extent that it wouldn’t even boot after the upgrade or networking would be broken, and if I couldn’t figure it out, I’d have to build a new server from scratch, copying things over from the old server.
Upgrading in place had been starting to feel wrong, though. I’m willing to accept that this server is, to some extent, a pet instead of cattle, but still, that doesn’t mean that I want random packages and configuration accumulating. I’d been taking notes on those occasions when I did have to rebuild from scratch, and the last time I did that, I figured out the exact packages I needed to install instead of just blindly copying over the list from the previous server; so rebuilding isn’t a big unknown, though it does take more time than an in-place upgrade.
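(Getting that package list is easy enough, by the way; here's a minimal sketch, though the output still needs some weeding by hand, since the stock install marks a few packages as manual too.)

```bash
# On the old host: list the packages that were explicitly installed,
# as opposed to pulled in as dependencies of something else.
apt-mark showmanual | sort > manual-packages.txt
```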
Anyways, when I went to upgrade from 19.04 to 19.10, networking broke completely, so it was time for another rebuild. And that got me thinking: I should just give up on the strategy of upgrading in place; it's too error-prone, and it leads to too much downtime. (I'll talk about that a little below.) But I didn't feel like rebuilding the server every 6 months, so I decided to get on the Ubuntu LTS train. In general, I like to do things incrementally, but I already had evidence that the incremental upgrades weren't actually all that smooth in practice, so I figured doing one full rebuild every two years was a better use of my time.
Which raised the question of what to do about 19.10. It made me nervous, but I decided to skip it: if a bad security vulnerability showed up, I’d have to figure out whether to do an emergency rebuild or to hope that Ubuntu would backport the fix to 19.04, but I was optimistic that nothing horrible would appear in the three months until 20.04 appeared.
I like to do server maintenance on long weekends, so this weekend I rebuilt a server with 20.04. And it went really smoothly! My notes were good; the main thing that had changed since last time was that I'd started using Let's Encrypt, so I had a new section to add to my notes, but it was super simple: I just had to add one directory to the list of directories to copy over from the old host. (It's a pretty short list: my home directory, the directory that contains websites, the mysql data directory, the Apache sites-available directory, and now the Let's Encrypt directory.)
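Roughly, the copy step looks something like the sketch below, pulled from the new server; the paths are just the stock Ubuntu locations for those directories, and the hostname and username are placeholders, so adjust to taste. (And the mysql data directory wants mysql stopped on both ends before it's copied.)

```bash
# Rough sketch of the copy, pulled from the new server (needs enough
# privileges on both ends). "old-host" and "me" are placeholders; the
# paths are the stock Ubuntu locations.
OLD=old-host

rsync -aAXv "$OLD":/home/me/                     /home/me/                     # home directory
rsync -aAXv "$OLD":/var/www/                     /var/www/                     # websites
rsync -aAXv "$OLD":/etc/apache2/sites-available/ /etc/apache2/sites-available/ # Apache site configs
rsync -aAXv "$OLD":/etc/letsencrypt/             /etc/letsencrypt/             # new this time around

# Stop mysql on both hosts before this one, so the data directory is consistent.
rsync -aAXv "$OLD":/var/lib/mysql/               /var/lib/mysql/
```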
And, right from the beginning, some surprise benefits of the strategy showed up. There were some packages that I'd had pinned at old versions on my previous host, because something weird had happened in an upgrade; I don't remember the exact history there, and I had more or less resigned myself to losing one bit of minor functionality, but the package was still there on 20.04 and worked fine. (It might have been in the Universe repository instead of the main one? But I needed Universe for something else anyways.)
Also, right from the beginning: much less downtime. If I'm going to upgrade in place, then I need to stop the old server, take a snapshot just in case something goes wrong (which takes half an hour or so), then do the upgrade, then hope everything goes well. So there's a noticeable and potentially unbounded amount of downtime; honestly, the number of 9s for this server isn't that high (since I stop it once a month to take a snapshot anyways), but still, I don't like downtime. Whereas if I'm building a new server, I can leave the old one running while I set things up on the new one; if I were getting constant comments on the blog or something, then syncing mysql over could be a little delicate, but I don't, so I don't have to worry about writes landing on the old mysql during the hour or two of transition. (I just have to make sure to avoid reviewing Japanese vocabulary during that period.)
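If I did have to worry about that, one reasonable approach would be a final dump-and-restore right before the cutover; here's a sketch with placeholder hostnames, though what I actually copy is the raw mysql data directory, as described above.

```bash
# Hypothetical final sync via dump-and-restore; hostnames are placeholders,
# and in practice I just copy /var/lib/mysql wholesale instead.

# On the old host, right before the cutover (consistent InnoDB snapshot):
sudo mysqldump --all-databases --single-transaction | gzip > /tmp/final-dump.sql.gz
scp /tmp/final-dump.sql.gz new-host:/tmp/

# Then on the new host:
gunzip -c /tmp/final-dump.sql.gz | sudo mysql
```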
In the past, rebuilding the server had caused delays because of DNS propagation; if I'm thinking about it, I can turn down the TTL a day in advance, but still, it's kind of a pain. But Digital Ocean finally added floating IP support a few years back, and I'd turned it on a few upgrades ago. (Which actually turned out to be another thing that was improved by the rebuild – initially you had to do some magic configuration on your server for that to work, but Digital Ocean improved things so that was no longer required on new servers.) So, once I thought I had Apache working, I could just flip over the IP and try to hit the web pages; it turned out to be broken, so I flipped the IP back while figuring things out, and flipped it again after I'd fixed the problem. (I'd gotten something wrong when copying files over: I'd even left myself a note saying "pay attention to this potential mistake"; I just hadn't actually taken that note seriously…)
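(For what it's worth, the flip itself is a one-click thing in the control panel, or a one-liner from the command line; here's roughly what it looks like with doctl, with made-up IDs.)

```bash
# Sketch of the cutover with doctl; the IP and droplet IDs are placeholders.
FLOATING_IP=203.0.113.10
OLD_DROPLET=11111111
NEW_DROPLET=22222222

# Point the floating IP at the new server...
doctl compute floating-ip-action assign "$FLOATING_IP" "$NEW_DROPLET"

# ...and if something looks broken, point it back while debugging.
doctl compute floating-ip-action assign "$FLOATING_IP" "$OLD_DROPLET"
```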
Anyways: it took a few hours, but all things considered, it was quite smooth. And the floating IP continued to be useful: for example, once I had the new server working, I wanted to take an initial snapshot of it, and that was fine, because the old server was still running and I could point the IP back at it while taking the snapshot.
I guess if I wanted, I could go even further: Digital Ocean has an RDS analogue now, so I could switch to using that as my database. Or I could move over to AWS, just to be in a slightly more familiar setting? I have mixed feelings about Digital Ocean, but it's been okay, and it's possibly a better match for this server than AWS. (Especially now that they've fleshed things out a little more.)
Pleasant enough way to spend a morning, at any rate: good to keep my hands dirty with this sort of thing, and always nice when computer maintenance goes well.