Backups should be a key part of any photographer’s workflow. It isn’t a matter of if you will have a storage device fail, but when. However, many photographers, amateur and pro alike, lack the background knowledge to understand what is really needed. In this post I will explore backup strategies for photographers, including risk analysis and general backup strategies, and discuss what is and is not a backup.
This is a high level post discussing backup strategies for photographers as a part of our series on workflow. It does not discuss specific tools in detail.
You might be wondering why I’m qualified to talk about this. Besides dealing with a couple of hundred thousand photos a year, all of which have to be backed up, I spent my previous career as an IT professional and software developer. At various times in that career I was responsible for helping others back up large and continually growing scientific data sets. I was also witness to people losing years of work after they failed to take backups seriously. I also have an M.S. in Computer Science and studied the underlying concepts discussed here during my brief academic career.
Thinking About Risk
The first step in designing your backup system is assessing the risks, and then deciding which you can tolerate (given the cost to mitigate) and which are unacceptable. The goal of the backup system is to mitigate or eliminate the risk of data loss in situations it is designed to handle.
Before we go further, I’m going to introduce Bob. Bob is a serious amateur photographer and he is going to be our fictional case study.
Let’s think about a simple backup system. Bob has a computer and an external hard drive attached to it. Whenever Bob downloads photos he carefully copies them to both his computer and the external hard drive. Bob does not want to lose his photos.
What risks does this system mitigate? Which won’t it help with? To figure that out, let’s brainstorm some common risks:
- Data corruption
- Hard drive failure
- Power surge (e.g. from a lightning strike)
- House fire
- Physical theft
- Local disaster (e.g. a flood or wildfire)
- Regional disaster (e.g. a major hurricane)
- National/global disaster (e.g. WWIII)
This list is not exhaustive, but it covers most of the likely scenarios. Think about Bob’s backup system. It will protect against a hard drive failure since it is unlikely that the drive in his computer and the backup drive will fail at the same time. It should also protect against most cases of data corruption since Bob doesn’t ever recopy files after initial import. However, it won’t mitigate any of the other risks. Anything that threatens all the electronics in his home (power surge, fire, flood, theft, etc) can easily destroy the original and backup at the same time.
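The same reasoning can be written down as a small table of which risks each set of copies survives. The sketch below is only an illustration: the `Copy` class, the `RISK_SCOPE` mapping and the place names are assumptions made up for this example, and the rule "a risk is mitigated if the copies span more than one device, site, city or region" is a deliberate simplification of the analysis above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the photo archive and where it physically lives."""
    name: str
    site: str    # street address
    city: str
    region: str


# What a single event of each kind destroys (an assumed simplification).
RISK_SCOPE = {
    "data corruption": "device",
    "hard drive failure": "device",
    "power surge": "site",
    "house fire": "site",
    "physical theft": "site",
    "local disaster": "city",
    "regional disaster": "region",
    "national/global disaster": "world",
}


def mitigated_risks(copies):
    """Return the risks that at least one copy should survive."""
    mitigated = []
    for risk, scope in RISK_SCOPE.items():
        if scope == "device":
            survives = len(copies) >= 2      # each copy sits on its own drive
        elif scope == "world":
            survives = False                 # nothing modeled here helps
        else:
            # The event wipes out one site/city/region, so the data
            # survives only if the copies span more than one of them.
            survives = len({getattr(c, scope) for c in copies}) >= 2
        if survives:
            mitigated.append(risk)
    return mitigated


# Bob's current setup: two copies, both at the same address.
bob = [
    Copy("computer", site="bob-home", city="hometown", region="home-region"),
    Copy("external drive", site="bob-home", city="hometown", region="home-region"),
]
print(mitigated_risks(bob))
# ['data corruption', 'hard drive failure']
```

Adding a third `Copy` with a different `site` but the same `city` would add the power surge, fire and theft entries to the result, but not the local disaster.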
Know Your Risk Tolerance
No one backup system is perfect for everyone. In fact, no one backup system is perfect for all photo shoots. Some people, especially those who shoot casually, may want to make a “best effort” to protect their images at almost no cost. Other people, in particular pro photographers, will place a high value on their image archives. They may even have a legal obligation to ensure that no minor disaster (e.g. a house fire, a lightning strike or even something regional like a flood or hurricane) destroys images.
Bob the Amateur
Bob, as a serious amateur, cares about his photos, but the “cost” of losing them is mostly emotional. All of the critical images could likely be recovered from online photo galleries, even if the RAW files were lost. Bob realizes he should probably have some sort of offsite backup (to mitigate risks like house fires and theft) where his images end up eventually. In the short term, any given shoot could be lost without Bob losing any sleep.
Susan the Wedding Photographer
Susan, on the other hand, is a professional wedding photographer and has a legal (contracted) obligation to take all reasonable steps to protect her clients’ wedding images. Losing someone’s wedding images would be emotionally devastating to Susan and her clients, never mind the damage it would do to her business.
Susan needs a backup plan that accounts not only for hardware failure and risks to her home, but also for local disasters (floods, wildfires, etc.) and maybe even regional and national level disasters.
Bob and Susan have very different risk tolerances.
The Time Factor
Bob and Susan differ in another significant way. Even if Bob decides he needs to mitigate the same set of risks, he likely won’t need to do so on the same time scale.
Susan, as a wedding photographer, wants to ensure that her clients’ images are backed up as soon as possible. She immediately uploads them into the cloud when she gets home from a wedding. This is expensive, time consuming and requires a fast internet connection.
Susan sleeps better after a wedding knowing that her clients’ one of a kind images are backed up off site already. If her computer, local backup and the cards are destroyed that night or the next day, through a house fire for example, she can recover the images from her online backups.
Don’t forget to consider the time factor in your backups. How long a window of risk can you tolerate? Many photographers are willing to risk the loss of a small number of images over a week or two, but not their entire archive for all time. The time scale may vary by the type of shoot, not just the photographer. Susan is much more relaxed about her personal, for-fun photography. She does not back those images up to the cloud immediately, but makes sure they end up in her online backups within a week of shooting them.
Backup Strategies for Photographers
Let’s talk nuts and bolts now. What strategies can a photographer use? What risks do they mitigate and how quickly?
A Local Backup Hard Drive
External USB hard drives are cheap and huge. It is simple to buy one and plug it in. You can manually copy your images over as Bob does, or you can employ an automatic system like the Mac’s Time Machine to do the backup for you. The net result is the same: your images end up on the backup hard drive pretty quickly.
In this context, a “local backup hard drive” is any hard drive physically at the same location. It doesn’t matter if it plugs into the computer via USB, or is network attached. If it is located at the same address, it is local.
If, like Bob, you copy the images over as you import them, the time scale is nearly immediate. Your backup contains what you imported. With an automatic system like Time Machine, there is some lag, but not much, since it updates the backup frequently (roughly every hour).
What risks does this mitigate? This type of backup (when used alone) mitigates the risk of losing data to a hard drive failure. That is a good thing since hard drives fail all the time. Unfortunately, when used alone, it mitigates almost no other risks. Anything that threatens Bob’s entire house, like a lightning strike or fire, can easily destroy Bob’s primary and backup copies at the same time.
As long as Bob knows this and accepts the risk, everything is good. If not, Bob needs to consider how to improve his backup strategy.
Off-Site Backup Rotation
Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway. — Andrew Tanenbaum
A step up from a simple local backup is a local backup with off-site rotation. For the mere cost of twice as many external hard drives, you simply keep one copy of the backup off site at all times. Enterprises use this type of strategy all the time, with rotation of backups off-site as frequently as daily.
Believe it or not, this is likely the most affordable strong backup option for many photographers. Hard drives are cheap. If one backup hard drive costs $100, then two cost $200. A one-time extra $100 for off-site backups of terabytes of image files is dirt cheap.
Bob Improves His System
Bob decides to add off-site rotation to his backup system. He buys a second hard drive like his first backup drive. Bob copies everything from his original backup (call it backup A) to his new backup (call it backup B). Bob then calls his friend Sam, who lives across town, and asks if he can stash one of his backup drives there. Sam says yes and Bob drops off backup A there, keeping backup B to use for the time being.
Bob decides he is comfortable with the risk of losing one month’s photography in the event of a personal disaster (a fire for example). Therefore Bob will rotate his backups monthly. On the first of the month he will take whichever drive he has at his house to Sam’s house and swap it. When he gets back home he will have to update the out of date backup he just brought home, which can take a while. Luckily it doesn’t require his attention so it is easy.
With this scheme, Bob has ensured that he has immediate protection from the most likely cause of data loss: a failed hard drive. He has also ensured that at most he will lose one month of photography if a disaster strikes his home. Bob is realistic and knows that if a major disaster strikes, say a hurricane that floods the entire city, Sam’s home won’t really protect his drive any better than Bob’s will.
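Bob’s exposure at any moment is simply the time since the last swap, because the drive at Sam’s house is only current as of the day it was dropped off. A minimal sketch of that bookkeeping follows; the function names, the 31-day default and the dates are assumptions chosen for illustration.

```python
from datetime import date


def days_exposed(last_swap: date, today: date) -> int:
    """Days of new work that exist only at home since the last drive swap."""
    return (today - last_swap).days


def swap_overdue(last_swap: date, today: date, max_days: int = 31) -> bool:
    """True once the exposure window exceeds the tolerated number of days."""
    return days_exposed(last_swap, today) > max_days


# Hypothetical dates: the drives were last swapped on the first of March.
last_swap = date(2018, 3, 1)
print(days_exposed(last_swap, date(2018, 3, 20)))   # 19
print(swap_overdue(last_swap, date(2018, 3, 20)))   # False
print(swap_overdue(last_swap, date(2018, 4, 15)))   # True
```

Someone with a lower risk tolerance than Bob would pass a smaller `max_days` and swap drives more often.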
The cloud is a brand new thing as human society goes. For the last several years most people have had enough upstream bandwidth to push large amounts of data into “the cloud”. But what is the cloud? Reduced to its simplest, “the cloud” is just someone else’s computers. That is a vast oversimplification but it drives home the salient point: the cloud isn’t magic. It’s just technology repackaged a slightly different way.
Susan’s Cloud Backups
For our purposes, you can imagine that a company has a hard drive labeled “Susan” and when she makes a backup to “the cloud” she is just copying her images over the internet to that hard drive. That hard drive might be in a building 2 doors down from Susan or on a different continent.
Types of Cloud Backups
Here is where things get more complicated because not all “cloud” backup systems are the same. There are 3 general classes for this discussion:
- Image hosting systems: Zenfolio, Photoshelter, 500px, Smugmug, etc.
- Cloud Backup Systems: Backblaze, Carbonite, etc.
- General Cloud Storage: Google Drive, Apple iCloud, Microsoft OneDrive, Dropbox, just to name a few.
To make things more complicated, the above categories are amorphous. You can generally use the third category (general cloud storage) for either of the other two through a front end app or service. Some of them even include that directly.
A word about security: while these all claim to be secure, they all involve humans (that’s you) and humans are ALWAYS the weakest link in security. If you put it in the cloud, someone you don’t intend can get their hands on it either through social engineering, actual hacking, or just mistakes on your own part setting things up (like choosing a stupid password). Maybe that is a deal breaker, maybe it isn’t. Just be realistic. Nothing “online” is hacker proof and any marketer that tells you otherwise is lying more than normal.
Regardless, let’s take them one by one.
Image Hosting Systems
Image hosting systems such as Zenfolio and Photoshelter usually have an unlimited level where you can pay to upload as many JPG files as you want. Photoshelter and Zenfolio both have options to upload RAW files and some other image formats. However, these types of systems are not general purpose backup systems. They are a place to dump edited image files and maybe RAW files. They generally require manual management of the archive structure (i.e. creating folders, uploading to the right place, verifying everything got uploaded).
Even so, if your primary concern is making sure your images are safe, they are a great option, especially if you already have them for some other reason. We, for example, have the highest level Zenfolio account so we upload all our JPGs. The backup this provides is a “freebie” since we use Zenfolio as our method of delivering images to our clients. They charge by the megabyte for RAW files so we don’t use Zenfolio for that. We actually have the unlimited Photoshelter account for that purpose only.
Regardless of which system you choose, it is important to remember that image hosting systems are not general purpose backups. They only protect the images you manually upload to them. They can still play a big role, but won’t help with other assets like scans of signed contracts, logo vector files, etc.
Cloud Backup Systems
General purpose cloud based backup systems are the new player in this field. To be viable for photographers they require a fast upstream (that is, upload) internet connection.
Let’s talk about latency… Susan uses a cloud backup system. The vendor claims its software proactively backs up new files on her computer “instantly”. That is marketing talk, which means it is a flat out lie. If Susan photographs a wedding and comes home with 32 gigabytes of RAW files, those take a while (maybe 8 hours) to upload even on a fast internet connection. The time it takes to complete that upload is the latency of the backup.
That latency is the window in which you can lose some or all of the images waiting to be backed up. Of course, the only way to do better is to copy the images to a hard drive and then drive it somewhere. As the quote above says, it’s hard to beat the bandwidth of a station wagon full of hard drives.
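To estimate your own window, divide the size of the shoot by your upstream speed. The sketch below does that arithmetic. The 10 megabit per second upstream speed is an assumed figure used only for illustration (check your own connection), and real uploads usually run somewhat slower than the nominal line rate.

```python
def upload_hours(size_gb: float, upstream_mbps: float) -> float:
    """Estimated hours to upload size_gb gigabytes at upstream_mbps megabits/s."""
    bits = size_gb * 8 * 1000**3               # decimal gigabytes -> bits
    seconds = bits / (upstream_mbps * 1_000_000)
    return seconds / 3600


# A 32 GB wedding on an assumed 10 Mbit/s upstream connection.
print(round(upload_hours(32, 10), 1))    # 7.1
# The same shoot on an assumed 100 Mbit/s upstream connection.
print(round(upload_hours(32, 100), 1))   # 0.7
```

At the assumed 10 Mbit/s the result lands close to the “maybe 8 hours” figure above once protocol overhead and other traffic are taken into account.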
The advantage of this type of system is that the backups are relatively automatic and they generally handle any type of file, not just images. They usually charge a relatively low fixed monthly fee. Once the upload completes, your images and other files are off site.
General Cloud Storage
Google Drive launched general cloud storage into the mainstream, but now a ton of companies offer it (many selling products using Amazon or Google’s enterprise cloud storage services under the hood). Google gives you 15 gigabytes and rising for free (as of 2018). That sounds like a lot, but it is nothing for a photographer. Amazon and other cloud services have similar small free accounts, none of which will work for a photographer wishing to back up RAW files, not to mention other files on their computer.
Of course, all these cloud service companies sell larger and even unlimited accounts. They suffer from the same latency problem that the cloud backup systems suffer from. You have to push all those gigabytes of RAW and JPG files through your internet connection, which isn’t as fast as you think.
The advantage of this type of system is you effectively have a huge network hard drive you can access from anywhere in the world. Need an old JPG or RAW file from 4 years ago while on vacation? No problem, most cloud storage systems are app accessible from your phone.
The downside is that most of these require manual updating or some add-on backup solution to keep them up to date. That isn’t impossible, but it is another moving piece to maintain.
Creating a Hybrid System
If you like things simple, you probably want to pick a single strategy (local only, rotated local or cloud) and go with it. Know what it protects you from and what it doesn’t, accept that and move on with life.
If on the other hand, you know what you want and need, and are willing to put in the time, you can create a hybrid backup system to satisfy your needs while minimizing costs and effort.
We’ve talked about Bob and Susan, now let’s talk about Andrew (that’s me!). I’m a professional photographer. I shoot a lot of different stuff from weddings and commercial jobs to the alpacas my wife wanted to pet this weekend. I do not have any single level of risk tolerance. Some of my images are critical and need immediate protection (e.g. weddings) and others could be lost and I wouldn’t lose much sleep (alpacas). Most are somewhere in between (paid jobs that could be re-shot if disaster struck, most of my personal work, etc).
I use a hybrid backup system and I take different actions based on the type of shoot in question. In general, I have an always-on local backup that I rotate off site and a cloud based image hosting system: Photoshelter. Between these two I cover all the bases I need.
Local Backups – Automatic
I have local automatic backups of both my laptop and file server. I rotate the backup drives off site to Josh’s house approximately once a month. If disaster strikes my house, I could lose up to a month’s work assuming I didn’t back it up any other way. If a disaster wipes out both Pflugerville and Taylor, well, I’m probably out of business anyway. Of course, my images *should* be safe in the cloud…
Cloud Based Backups – Manual
As I mentioned before, we have the unlimited Photoshelter account that allows unlimited RAW file uploading. The only real drawback is that I have to manually upload the images. It is not automated. As a result, my process of backing up images to Photoshelter depends on the type of shoot.
If the shoot is a one of a kind, impossible to re-shoot type, like a wedding, I upload the images immediately when I get home to an “off camera backups” area I have in Photoshelter. Because of the upload latency, that backup won’t complete for a while, sometimes as much as 24 hours for a large wedding. To protect against loss in the meantime, I keep the cards sacrosanct and in easy reach (next to the bed). Should there be an emergency at my house, the cards are the 3rd thing I grab on the way out: wife, cats, not-yet-backed-up wedding photos.
Most of my shoots don’t fall into that category. Instead I rely on my local backup system to protect against hard drive failures until I finish editing the images. Generally I finish editing paid customer work within a day or two and upload the finished JPGs to Zenfolio to deliver them. At that point there is a backup of the finished images at least.
Some time after that, I will archive the shoot. Part of that process is pushing everything, RAWs and JPGs, into my Photoshelter archive. Photoshelter is my master cloud based backup. Images might take a week or two to end up there, but once they are there, they are there.
Backups Are Living Things
Whatever backup scheme you set up, remember that a backup is a living thing. You have to feed and care for it. You should regularly check the viability of your backups by recovering a few random files to a temporary location. Verify that they are correct.
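One way to run such a spot check is to restore the backup (or part of it) to a temporary folder and compare checksums for a random sample of files. The sketch below is a minimal illustration using only the Python standard library. The function names, the sample size and the example paths are assumptions, and it presumes the originals are still intact and readable.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large RAW files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check(original_root: Path, restored_root: Path,
               sample_size: int = 5, seed=None):
    """Compare a random sample of files between the two folder trees.

    Returns the relative paths that are missing from the restored
    tree or whose contents differ from the original.
    """
    files = sorted(p for p in original_root.rglob("*") if p.is_file())
    rng = random.Random(seed)
    sample = rng.sample(files, min(sample_size, len(files)))
    problems = []
    for src in sample:
        rel = src.relative_to(original_root)
        restored = restored_root / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel)
    return problems


# Example call with hypothetical paths:
# bad = spot_check(Path("/Photos/2018"), Path("/tmp/restore-test/2018"))
# An empty list means every sampled file was restored correctly.
```

A sample only gives partial assurance, so it is worth varying which files are checked each time rather than fixing the `seed`.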
You should also be checking whatever software you use (if any) to automate your backups. Check that the quantity of data backed up is rational. If you have a mostly full 2TB drive in your laptop and your backup drive has 200GB of data on it, alarm bells should go off. If the software has any warnings or errors, dig into those. Don’t just ignore them.
And finally, you have to remember to do your job. If you plan on using rotating off-site backups, you have to remember to rotate the hard drives. Google Calendar or another automatic system of reminders is a good idea.
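The size comparison can be automated as a rough sanity check. This is a sketch under stated assumptions: the helper names and the 80% threshold are made up for illustration, and backup software that compresses, deduplicates or deliberately excludes folders will legitimately produce a smaller backup, so treat a failure as a prompt to investigate rather than proof of a problem.

```python
from pathlib import Path


def tree_size_bytes(root: Path) -> int:
    """Total size of all regular files under root."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def backup_size_plausible(source_bytes: int, backup_bytes: int,
                          min_ratio: float = 0.8) -> bool:
    """False when the backup is suspiciously small compared to the source."""
    return backup_bytes >= source_bytes * min_ratio


TB = 1000**4
GB = 1000**3

# A mostly full 2TB drive (assumed here to hold 1.8 TB) with a 200GB backup.
print(backup_size_plausible(int(1.8 * TB), 200 * GB))        # False
# The same drive with a 1.7 TB backup.
print(backup_size_plausible(int(1.8 * TB), int(1.7 * TB)))   # True
```

In practice `tree_size_bytes` would be called once on the photo folder and once on the backup folder to get the two numbers.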
RAID is Backup and other Misconceptions
I’ve run across numerous misconceptions about backups and storage. Let’s go over the major ones.
RAID is Backup
Probably the most common thing I hear is “I don’t need backups, I have a RAID drive” or “I have mirrored hard drives.”
RAID stands for redundant array of independent disks. It is a powerful tool but it (regardless of the RAID level you use) is NOT backup. Let’s look at the best case: RAID 1, or what is commonly called mirroring. Two hard drives are kept in perfect sync. When you write a file onto the file system, the data is written (nearly) simultaneously to both hard drives.
Sounds like backup, right? Superficially it is similar to an automatic local backup. It shares all the same weaknesses as local backups: it can be destroyed by any local disaster or simple theft. But it gets worse. When a hard drive in a RAID array dies, you replace it. This precipitates a full read of all the other disks and in some cases a lot of write activity too. You are basically stress testing all the drives in the array. Chances are they are all of similar vintage. I’ve seen numerous occasions where a single disk failure resulted in multiple additional drives in an array failing within the next 48 hours.
To further complicate things, any systematic problem (say file system corruption or a hardware defect in the drive model you use) is mirrored also. You instantly duplicate the corrupted data or trigger the same bug on every drive.
RAID is a great tool for what it was designed for: up-time and throughput. Solutions like mirroring help up-time because they allow a system to keep running while a failed drive is replaced. But they are not a backup system.
I Can Get The Hard Drive Recovered
A) Recovery is expensive.
B) It often fails totally depending on how the drive failed.
C) Even if it succeeds the data you get back is often mangled in some way.
Do not bet on data recovery as a backup strategy. I literally saw a grad student lose 4 years of Ph.D. research that way.
SSDs Don’t Fail
Wrong… If you finish any statement with “Don’t Fail” you are categorically wrong. The Titanic couldn’t sink either, and yet it did. SSDs have nearly as many failure modes as spinning hard drives. In fact, the most common failure I’ve seen in hard drives is not mechanical but electrical, in the drive’s own circuitry. SSDs have most of that circuitry too.
The lack of moving parts should improve reliability, but it is still early days. Even if it does dramatically improve reliability, they will still fail sometimes.
Hopefully you have a better understanding of the strategies you can use to back up your images and other data files as well as the trade-offs between them. Before you pick a strategy, start by thinking about the risks you can tolerate and what you are willing to lose. Assume you will have a hard drive failure or other disaster at some point and design your system to recover from that. When the worst does happen, you’ll be ready.
He is a self-taught experiential learner who is addicted to the possibilities that new (to him) gear opens up. He loves to share the things he has worked out. Andrew started with a passion for landscape and night photography and quickly branched out to work in just about every form of photography. He is an ex-software developer with extensive experience in the IT realm.