Backup and versioning strategy

June 6, 2016 · Backup Security Data Versioning Git

Hard drive cover

How often do you back up your data? How safe is it? Have you tried restoring it recently? Have you ever tried restoring it at all?

A while back, Scott Hanselman published an article on his blog about this topic, and pretty much at the same time, one of my sisters experienced a small backup strategy hiccup:

  • Fail to back up correctly
  • Fail to notice it
  • Live in a rainbow & unicorn world where your data is safely running on marshmallow plains
  • Lose your original and realize that you have no backup
  • Have a nice day!

As Scott H. puts it (go read his article), the two most important points about backing up are: make 3 heterogeneous copies of your data (both on-site & off-site, in different formats), and try restoring data from your backups.
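The "try restoring" part is easy to skip, so here is a minimal sketch of how a test restore could be checked, assuming you have restored a folder from a backup next to the original. It hashes every file in both trees and reports what is missing or different. The function names (`sha256_of`, `checksum_tree`, `compare_restore`) are made up for this illustration and are not part of any backup tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash one file in 1 MiB chunks so large files do not fill the memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_tree(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_restore(original: Path, restored: Path) -> dict[str, list[str]]:
    """List files that are missing from, or differ in, the restored copy."""
    source = checksum_tree(original)
    copy = checksum_tree(restored)
    return {
        "missing": sorted(set(source) - set(copy)),
        "changed": sorted(
            name for name in source.keys() & copy.keys()
            if source[name] != copy[name]
        ),
    }
```

Calling `compare_restore(Path("Documents"), Path("restore-test/Documents"))` (both paths are placeholders) should give two empty lists if the restore is complete. Files edited since the backup ran will show up as "changed", which is expected.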

I personally have a few backups in place following that exact theory, but for me backing up goes further than just replicating data. It also means versioning it.

In my closet

The first and central piece of my backup strategy is a Network Attached Storage (NAS). This is simply a small server (Synology) running on my local network, to which I push all the data I have. There is nothing on my laptop that I couldn't afford to lose.

I have restored my Surface a few times already and never lost more than the few hours of setting everything back up to be fully productive.

All the data is accessible directly from the NAS on my home network, from all of my devices (PCs, mobile phones & tablets).

The NAS itself also runs in RAID (redundant array of inexpensive disks) mode, so the data is replicated across its disks as well.

The safe

The second piece of backup is a paid service called CrashPlan. It replicates some of the data present on the NAS to the CrashPlan cloud. I used to have CrashPlan on my machine as well, but since I added the NAS to the picture, it is the central data pool that gets replicated, not my client itself.

This is really a "configure and forget" service. I get one email a month from CrashPlan telling me how much the storage used has changed. If it doesn't go up, I have to look into it. So far, so good.
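That "did it go up?" check is simple enough to write down. The sketch below assumes you note the storage figure from each monthly email by hand; it does not read CrashPlan's emails or talk to any CrashPlan API, and both the function name `stalled_months` and the numbers are invented for the example.

```python
def stalled_months(usage_gb: list[float]) -> list[int]:
    """Return the positions of months whose usage did not grow.

    usage_gb holds one figure per month, oldest first. A month is flagged
    when its figure is less than or equal to the previous month's figure.
    """
    return [
        index
        for index in range(1, len(usage_gb))
        if usage_gb[index] <= usage_gb[index - 1]
    ]


# Hypothetical figures copied from five monthly reports, in GB.
monthly_usage = [100.0, 104.5, 104.5, 110.2, 109.0]
for index in stalled_months(monthly_usage):
    print(f"Month {index + 1}: no growth, worth a look")
```

A flat or shrinking month is not necessarily a failure (you may simply have deleted files), but it is the cue to open the client and check that the backup still runs.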

The butler(s)

My third backup strategy is a domain-specific one.

The mobile phone pictures are automatically downloaded by iCloud and saved on the NAS. All the pictures I want to keep are pushed to Flickr. This still requires some manual steps to pick a subset of the pictures I produce and upload them with the corresponding tags. But it's worth it.

For the day-in-day-out / multiple device stuff I might want to access from "outside", I use Dropbox.

Additional Disk

I used to have an additional disk on which I would regularly duplicate the CrashPlan data. This disk would only be connected for a few hours, long enough for the duplication to happen. I would then unplug it and leave it somewhere in the house. Since I added the NAS to the picture, I stopped doing this.

If I were paranoid, I would do this... and put the disk somewhere else. And encrypt it... but I'm not.

Keeping tabs, versioning

With this, I more or less have the 3-tier backup Scott Hanselman was talking about. But in my opinion, one key element is missing: the versioning strategy.

What if you don't like the changes you made in the past 10 hours? Do you have a way to roll back? Do you have a way to compare the two work states? What if your Word document becomes corrupted and you cannot open it anymore? Is there any way of getting back to a version from X minutes ago?

Dropbox brings some versioning with it: the history of every modified file is kept for a while. Unfortunately, this only happens when you are online, and you have no control over it. You just know that "some version" is regularly saved.

Versioning with 'git'

For my personal projects, I started relying on the source control system git. And by "project" I mean "everything I do on my computer": coding of course, but also writing documents, creating presentations, writing articles, drawing images etc. For all the work I produce where some kind of versioning might be interesting, I create a git repository and assign a remote master in the cloud. Depending on the use of the project, it will end up on GitHub (public) or Bitbucket (private).

Creating a repo on GitHub or Bitbucket is pretty easy, and using SourceTree from Atlassian makes git usage quite straightforward.

I'd recommend using this whenever you need more control over what you are creating, and don't even bother starting to write something (yes I'm looking at you little sis') without thinking about the versioning and backup strategy for your work.

Learning git can be quite overwhelming, but what you really need to start is quite limited:

  • create a repo
  • add a remote master (e.g. on GitHub or Bitbucket)
  • commit changes
  • push changes to the remote master

With this you can already work alone and version your work.
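The four steps above can be sketched as a small script, assuming git is installed, your name and email are already configured in git, and an empty repository exists at the remote URL. The helper names (`git`, `start_versioning`, `save`) and the URL in the usage note are placeholders for this illustration; the git subcommands themselves are the standard ones.

```python
import subprocess
from pathlib import Path


def git(*args: str, cwd: Path) -> str:
    """Run one git command inside cwd and return what it printed."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def start_versioning(project: Path, remote_url: str) -> None:
    """Put an existing folder under version control and publish it."""
    git("init", cwd=project)                                  # create a repo
    git("remote", "add", "origin", remote_url, cwd=project)   # add a remote
    git("add", "--all", cwd=project)                          # stage every file
    git("commit", "-m", "Initial commit", cwd=project)        # commit changes
    # Push the current branch, whatever its name, and remember the remote.
    git("push", "--set-upstream", "origin", "HEAD", cwd=project)


def save(project: Path, message: str) -> None:
    """Commit and push later changes; fails if there is nothing to commit."""
    git("add", "--all", cwd=project)
    git("commit", "-m", message, cwd=project)
    git("push", cwd=project)
```

For example, `start_versioning(Path("my-article"), "https://example.com/me/my-article.git")` would be run once, and `save(Path("my-article"), "Rewrite the intro")` after each writing session. SourceTree does the same thing behind its buttons.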

Have a look at this article from Wired Magazine if you are not convinced: From Collaborative Coding to Wedding Invitations: GitHub Is Going Mainstream.

One more step could be to replicate my git repositories on the NAS, so that I rely on Bitbucket and GitHub only for their availability and not for their redundancy. But I'm pretty sure both companies will still be there tomorrow.

Summary

  • Local backup on NAS with RAID redundancy
  • Automatic duplication of the most important data on CrashPlan
  • Domain-specific backups on Flickr, iCloud and Dropbox
  • Versioning with Git on Bitbucket and GitHub

And you?

  • How do you back up your stuff?
  • Do you use some kind of versioning as well?
  • If your house burned, your laptop was lost or your USB stick suddenly didn't respond... how would you feel - data-wise - about it?

Note: this post was as much for you as it was for me, in order to summarize my efforts and check if there are major gaps in my strategy. If you saw something, let me know!

Img Source: Laptop HDD from Chris McClanahan (CC BY-SA 2.0)
