My Data Backup Strategy

When I started recording music digitally, the idea of backing up my work became important to me.  But my recording output was such that backing up data to CD worked fine and I was happy with that for many years.  Then the wife and I had kids and all of a sudden our photos became a hell of a lot more valuable.  So we bought a backup drive and happiness ensued again.  Then I had my old web site hacked and infested with spyware.  Fortunately I had the (spyware-clean) development version of the site on my laptop and was able to restore it.  About a month later that laptop crashed, and I finally started to get the idea that my backup was insufficient.  And when I started getting heavily into photography, I realized that the volume of photos I would take would probably make CD backup too unwieldy.  It became clear that I needed to think a bit about it and come up with a reasonable backup strategy that I wouldn't outgrow too quickly, without creating an over-engineered Rube Goldberg-type process that got in the way of actually getting shit done.  Truth is, I was a lucky fool for not putting something in place sooner.  Given the service life of a typical computer hard drive, it was only a matter of time before I got seriously burned.

So here I'm going to describe my backup strategy for photos, although it also applies to my music and web site files, and may one day apply to video files as well.  Not being dependent on any of these endeavors for my livelihood, my backup strategy is fairly simple but seemingly complete.  If you find a hole in this strategy, let me know; I'd love to fix it. 


First, I have some general rules I try to follow in the strategy and in my process of working with photos.
  • Three layers of storage.  By "layer" I'm referring to copies of the photo on some type of media -- memory card, hard drive, optical disk, tape, online storage, etc.
  • Redundancy at all times.  A photo should exist in at least two layers at any given time.  So if one becomes inaccessible, the other is, well, the backup.
  • Off-site storage.  One of the layers must be located in a different physical location than the others.  This is to have a backup in case of a fire or natural disaster at my home that could destroy more than one layer of backup in one go.
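
For the programmers reading, the redundancy and off-site rules can be sketched as a small check.  This is only an illustration in Python: the `Layer` class, the `check_rules` function and the `require_offsite` flag are made-up names, not part of any tool mentioned in this post.  The flag is there because the off-site rule only makes sense to check once a photo has reached long-term backup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One place a photo can live (hypothetical model for illustration)."""
    name: str
    offsite: bool  # True if the media is stored away from home


def check_rules(layers_holding_photo, require_offsite=False):
    """Return a list of rule violations for one photo (empty list means OK)."""
    problems = []
    # Redundancy at all times: at least two layers must hold the photo.
    if len(layers_holding_photo) < 2:
        problems.append("redundancy: fewer than two copies")
    # Off-site storage: at least one copy must live outside the home.
    if require_offsite and not any(layer.offsite for layer in layers_holding_photo):
        problems.append("off-site: no copy outside the home")
    return problems


card = Layer("camera card", offsite=False)
laptop = Layer("laptop", offsite=False)
usb = Layer("USB drive", offsite=False)
online = Layer("online storage", offsite=True)

print(check_rules([card]))  # ['redundancy: fewer than two copies']
print(check_rules([laptop, usb, online], require_offsite=True))  # []
```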

What I Back Up

This is a more involved question than it may first appear.  Many of my photos go through multiple stages of development incorporating different tools throughout the process.  Should I back up the results of each stage along with any ancillary files generated by the tools?  For instance, Photomatix generates a .xmp file with the settings used to generate the HDR image, in addition to the image itself.  Software developers would back up everything at each stage since configuration management is a critical issue, and photography post-processing can very well be a CM issue too, especially at the professional level.  I'm not a pro, but being a software guy at heart I went back and forth on whether this was a CM issue for my purposes.

But I reflected on something from my music recording experience:  After some period of time (anywhere from a few weeks to a few years), I can safely conclude that I'm no longer interested in working on a project.  I'll listen to the song but I'm not going to record more tracks, or re-mix it.  In addition, after about 5 years or so, I may no longer have access to the tools I originally used to create those interim files because I've migrated to newer tools.  So keeping those files around is just an unnecessary nuisance.  Photography is really the same way.  After a certain amount of time, I'll look at a photograph but I have no intention of tweaking it anymore; I've moved on and I'm focused on new ones.  The criterion I use for determining whether something should be put into long-term backup is, "Will I really need this 20 years from now?"  And the answer to that is "No" except for the final output images.

My approach then is to back up interim development files for some limited period of time while I still might work on them, but to back up the final image files indefinitely.
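
One way to picture this retention policy is a script that lists interim files that have not been touched in a while, so they can be reviewed before anything is deleted.  This is a rough sketch under assumptions that are not from my setup: finished images are assumed to live in folders named `final`, and the 365-day cutoff is an arbitrary placeholder.  The script never deletes anything; it only produces a review list.

```python
import time
from pathlib import Path

FINAL_DIR_NAME = "final"    # hypothetical folder name for finished images
INTERIM_MAX_AGE_DAYS = 365  # hypothetical cutoff, pick whatever suits you


def stale_interim_files(root, now=None, max_age_days=INTERIM_MAX_AGE_DAYS):
    """List interim files under root not modified within max_age_days.

    Files inside any folder named FINAL_DIR_NAME are never listed.
    Nothing is deleted; the result is only a list to review by hand.
    """
    root = Path(root)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Skip anything stored below a "final" folder.
        if FINAL_DIR_NAME in path.relative_to(root).parts[:-1]:
            continue
        if path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)
```

Calling `stale_interim_files("Photos")` on a working folder would return the old interim files as `Path` objects, leaving the decision to delete them with a human.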

Backup Layers

Layer 1 Working Copy.  I transfer all my camera images, taken in both RAW and JPG format, to my laptop where I do my post-processing.  I do this as soon as possible after shooting them because until I do, my photo exists in only one place, which violates the redundancy rule above.  The original camera images and interim development files on my laptop make up Layer 1, my "working copy" of the files.  I leave the original images on the camera card until later in the process because if I deleted them immediately then they would only exist in one place again!

Layer 2 Near-line backup.  Layer 2 is a Seagate GoFlex Ultra-portable external USB hard drive that mirrors the working copy using FreeFileSync, which is a wonderful (and free!) file synchronization tool.  FreeFileSync will make and maintain a duplicate of Layer 1, copying and deleting files on Layer 2 as required.  Should something happen to my laptop's hard drive, I can restore my working directory using Layer 2.  Once the files have been propagated to Layer 2, I can delete the originals from the camera card since the images now exist in two places.  When I think I no longer need access to interim development files, I delete them from Layer 1 and FreeFileSync will automatically delete them from Layer 2.  That's a permanent deletion in this strategy so I do not take this decision lightly.
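
To make the mirroring idea concrete, here is a minimal Python sketch of a one-way mirror: copy whatever is new or changed from the source, then delete whatever the source no longer has.  This is not how FreeFileSync is implemented, and the `mirror` function is a made-up name; it only illustrates the behavior described above, including the fact that deletions on the source become deletions on the backup.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source, target):
    """Make target match source (one-way). Returns (copied, deleted) paths.

    Illustrative sketch only: empty folders left behind in the target
    are not removed, and there is no undo for deletions.
    """
    source, target = Path(source), Path(target)
    target.mkdir(parents=True, exist_ok=True)
    copied, deleted = [], []

    # Copy files that are missing from the target or differ from the source.
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = target / rel
        if not dst.exists() or not filecmp.cmp(src, dst, shallow=False):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
            copied.append(rel)

    # Delete files from the target that no longer exist in the source.
    for dst in list(target.rglob("*")):
        rel = dst.relative_to(target)
        if dst.is_file() and not (source / rel).exists():
            dst.unlink()
            deleted.append(rel)

    return sorted(copied), sorted(deleted)
```

Running `mirror("Photos", "/path/to/usb/Photos")` (both paths are placeholders) would bring the second folder in line with the first.  The deletion pass is exactly why removing an interim file from Layer 1 is a decision not to take lightly.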

Layer 3 Online backup.  I use Google Drive to store final output files (in JPG format) for long-term storage.  I do not back up RAW images or development files to long-term storage because they don't pass the 20-year litmus test.  Online backup satisfies the goal of off-site storage.  It also provides additional levels of backup on top of my strategy:  First, Google Drive will synchronize stored files to any number of computers you have, so my photos get copied over to my home desktop computer as well.  Also, the Google Drive service includes redundancy and backup of their servers' drives, so I can have a higher degree of confidence they're not going to lose my files.  Now, there are of course several alternatives to Drive, but I chose Google's service because its disk space can be shared with Blogger and Picasa Web, which I also use, so I can pay one company and apportion the use of space as I see fit.  Also, if the level of integration between Drive and those other services is enhanced in the future, which seems like a reasonable possibility, then I'll be well-positioned to take advantage of it.  Finally, since I'm using it for long-term storage I wanted to choose a company that seemed like it would be around for a long time, and Google is about as solid as it gets in the Internet world.
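
The "final JPGs only" rule for long-term storage can also be sketched in a few lines of Python.  The assumptions here are mine for the sake of illustration: finished images sit in one folder, the online service syncs a local folder on the computer, and `archive_finals` is a made-up function name.  Unlike a mirror, this copy is add-only, so nothing already archived is ever overwritten or deleted.

```python
import shutil
from pathlib import Path

FINAL_EXTENSIONS = {".jpg", ".jpeg"}  # RAW and development files are skipped


def archive_finals(final_dir, synced_dir):
    """Copy final JPGs into the locally synced folder. Returns new copies.

    Add-only: existing files in synced_dir are never overwritten or deleted.
    """
    final_dir, synced_dir = Path(final_dir), Path(synced_dir)
    added = []
    for src in sorted(final_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in FINAL_EXTENSIONS:
            continue
        dst = synced_dir / src.relative_to(final_dir)
        if dst.exists():
            continue  # already archived; leave the long-term copy untouched
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        added.append(dst)
    return added
```

Once the files land in the synced folder, getting them off-site is the sync client's job, not this script's.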

This is all integrated into my so-called "workflow," which is a highfalutin name for the series of steps one uses to develop a photo from shooting to the final image that gets posted or printed.  It's a fairly cheap solution:  As of the time of this writing, about $80 for the GoFlex drive, and $60/year for 100GB of storage on Google Drive.  There are cheaper online storage solutions, but the integration/storage sharing of Drive with Picasa and Blogger is worth the extra $10/year or so.