How to: Create a proper backup process and why you need one

Fujifilm LTO Ultrium 3 Data Cartridge

Problem / Outcome Summary

  • This article will show you what a traditional backup process is and why you need one
  • Please see the ‘Summary Overview’ tab below for a high level view of the objectives this ‘howto’ will achieve.

Why might I want to read this?

  • Because you value your data (photographs, documents, music etc).
  • Because this is what the large corporates do (whose businesses would fail if their data were lost), and they have a lot of experience in knowing how to protect data.

To be clear, what is the difference between a regular backup and a proper backup?

This article comes from many conversations I have had with others, some of which enter the backup realm.  Some people are content with a single copy of their data existing somewhere in their house, some like to take that copy to another location, some like to automate it or put a structure around it, and some, even after many conversations, have no idea why they need to back up at all.  Generally, I’ve found that there’s either a certain amount of circumstance and practicality around this decision or a certain amount of misunderstanding, which I want to call ignorance, except that would imply something which doesn’t apply to people who are clearly intelligent.  However, it truly is surprising how many people don’t want to listen to reason, which in my experience always changes once they’ve lost the last 10 years of their family photography to a hard drive crash, fire or burglary.

To that end, a true and proper backup would protect you from all of these types of failures, not just a single type.  There most definitely is a place for lesser backups (I have some of my content that is never backed up even once), but we all do need to understand the appropriate solution for the appropriate need so an informed decision can be made.

So what are the different types of backups?

Well, some of these are not officially ‘types’ in the IT world; however, they serve their purpose in certain circumstances.  Please let us know if you know of a different type of backup not listed here, so we can add it.

  • Single Copy Local: A complete copy of the relevant data, stored locally, most likely on another separate device and fully able to be overwritten by another single copy
  • Single Copy Offsite: A complete copy of the relevant data, stored offsite, most likely on a separate device and fully able to be overwritten by another single copy
  • Single Copy Internet / Cloud: A complete copy of the relevant data, stored on the internet and fully able to be overwritten by another single copy.
  • Multiple Copies Local:  Multiple copies that are stored locally, most likely on an external device and able to be overwritten by another single copy
  • Multiple Copies Offsite: Multiple copies that are stored externally, most likely on tape, and not able to be overwritten by another single copy unless it is scheduled to do so
  • Multiple Copies Internet / Cloud: Multiple copies that are stored on servers accessed via the internet or ‘cloud’, and generally not able to be overwritten unless scheduled to do so

Please note: These are the types of backup ‘storage’, and should not be confused with the method of backup file selection and transfer.  Backup file selection and transfer methods include incremental, full, partial and even delta, which are not discussed here.

Right, so what is the practical difference between the different types of backups and why do they matter?

This can be summed up by thinking about the disaster scenarios.  Common scenarios include accidental deletion and drive failure; less common scenarios include file or drive corruption, fire, flood and theft.

When you consider each of these scenarios, you can see that different backup methods are required to fully mitigate each particular failure.  To make this easier to understand, we’ve created the table below:

Backup type               Accidental Deletion  Drive Failure  File/Drive Corruption  Fire  Flood  Theft
Single Copy Local         Partial              Yes            No                     No    No     No
Single Copy Offsite       Partial              Yes            No                     Yes   Yes    Yes
Single Copy Internet      Partial              Yes            No                     Yes   Yes    Yes
Multiple Copies Local     Yes                  Yes            Yes                    No    No     No
Multiple Copies Offsite   Yes                  Yes            Yes                    Yes   Yes    Yes
Multiple Copies Internet  Yes                  Yes            Yes                    Yes   Yes    Yes

In the table above, you can see that a local copy does not protect you from catastrophic failure and that an offsite copy does.  This is because fires, thefts or floods can completely destroy all the copies of data within a single location.
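For those who like to see this sort of thing as data, here is a minimal Python sketch of the same table.  The ratings are copied from the table above; the names `SCENARIOS`, `PROTECTION` and `fully_protected` are made up for this illustration.

```python
# Protection ratings copied from the table above, one list per backup type,
# in the same column order as SCENARIOS.
SCENARIOS = ["Accidental Deletion", "Drive Failure", "File/Drive Corruption",
             "Fire", "Flood", "Theft"]

PROTECTION = {
    "Single Copy Local":        ["Partial", "Yes", "No", "No", "No", "No"],
    "Single Copy Offsite":      ["Partial", "Yes", "No", "Yes", "Yes", "Yes"],
    "Single Copy Internet":     ["Partial", "Yes", "No", "Yes", "Yes", "Yes"],
    "Multiple Copies Local":    ["Yes", "Yes", "Yes", "No", "No", "No"],
    "Multiple Copies Offsite":  ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes"],
    "Multiple Copies Internet": ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes"],
}


def fully_protected(scenarios):
    """Return the backup types rated 'Yes' for every scenario given."""
    result = []
    for backup_type, ratings in PROTECTION.items():
        levels = dict(zip(SCENARIOS, ratings))
        if all(levels[s] == "Yes" for s in scenarios):
            result.append(backup_type)
    return result


# Which backup types cover both an accidental deletion and a house fire?
print(fully_protected(["Accidental Deletion", "Fire"]))
```

Asking for every scenario at once leaves only the two ‘Multiple Copies’ options that live outside your home.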

What is interesting, though, is that a single copy does not protect you from file corruption and only partially protects you from file deletion.  This is because when a file is deleted, and depending on how your backups are copied, you may actually be telling the system to remove any files that do not exist in the source.  With only a single copy available, you have now lost the point in time where that file existed.  In addition, when a file becomes corrupted, the file looks normal to the copy process (even though it can no longer be opened), so the backup copies the corrupted file to the backup target as is.  This means that your backups no longer contain a copy of the original uncorrupted file.
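To make this concrete, here is a toy simulation in Python.  It is only a sketch: files are modelled as dictionary entries, and `mirror` is a made-up stand-in for any backup job that makes the target look exactly like the source.

```python
def mirror(source):
    """Return an exact copy of the source, as a 'mirror' style backup would."""
    return dict(source)


source = {"photo.jpg": "good data", "letter.doc": "good data"}

# Day 1: both strategies take their first backup.
single_copy = mirror(source)
history = [mirror(source)]        # multiple copies: every run is kept

# Day 2: one file is deleted by accident, another is silently corrupted.
del source["letter.doc"]
source["photo.jpg"] = "corrupted"

single_copy = mirror(source)      # the only copy is overwritten
history.append(mirror(source))    # a new copy is added, the old one stays

print("letter.doc" in single_copy)   # the deletion was mirrored
print(single_copy["photo.jpg"])      # so was the corruption
print(history[0])                    # day 1 is still intact
```

The single copy faithfully reproduces both problems, while the retained day 1 copy still holds the good versions of both files.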

To mitigate this, multiple copies must be retained for periods of time.  Companies have been using various systems to create multiple offsite copies for years, for the very purpose of ensuring the data can be restored from any point in time should a disaster occur.  Arguably, the most famous of these processes is called ‘Grandfather, Father, Son’, or GFS for short.

But before we explain GFS, we also need to consider which types of files warrant a more serious backup and which could, for example, just have a single copy.

Documents

For most people, documents stored on a computer would be the most obvious first choice when choosing a more serious solution.  Documents tend to get added to, edited and moved around, and many are not re-opened for years.  Documents are often important, and unless you’re filling them with images they are usually quite small, making them cheap and easy to back up.

Pictures

Pictures are also often important, given that most people keep years of photos on their computer hard drives and never make an additional copy or print them out.  Photos, however, can end up using quite a lot of space, which creates some challenges when creating multiple backups.

Music

Music is also quite large, usually somewhat similar in size to pictures; however, music usually comes from sources that can be recreated.  For example, if you’re an iTunes user, you can re-download the music on any device using your iTunes account.  If you copied the music from CDs, you may be able to recreate it from the CDs again, and if you got your music from somewhere else, you can probably get it from there again.  For these reasons, I’d recommend thinking twice before delving into a complicated backup regime.  Perhaps just create a single offsite backup.

Video

Video is an interesting one and comes in many forms.  Some people have large video libraries converted from physical media, and some have home video converted from VHS, 8mm, MiniDV, MP4 formats etc.  Video can get very large, and storing it with complicated backup regimes can consume a lot of time, effort and storage media.  I’d still advocate for at least one offsite copy of some sort.

Other

There are still other types of data to back up.  Databases and email, for example, have their own challenges, but those are more complex examples that we won’t be covering here.

GFS Backup

So a ‘Grandfather, Father, Son’ backup is arguably the most popular tried and true backup process that wraps around your actual backups.  It stipulates when to make a backup, of what type and how long to store it for.  The idea is simply to be able to go back to a point in time of your choosing, be it days, weeks, months or even years.  How far back is up to you and is only limited by the amount of storage you have available.

So let’s have a look at a typical GFS schedule below so you can see what I mean.

                 Daily  Weekly  Monthly  Yearly
Number of Tapes  8      4       6        7

The table above represents the typical number of storage devices (still most commonly tape) required for a standard GFS backup.  18 tapes (8 + 4 + 6) would be required for the rotating sets, plus one tape for every year you wish to store your data (7 in this example).
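The arithmetic is simple enough, but as a quick sketch in Python (the names `ROTATING_TAPES` and `tapes_required` are just for illustration):

```python
# Rotating tape sets from the table above.
ROTATING_TAPES = {"Daily": 8, "Weekly": 4, "Monthly": 6}


def tapes_required(years_retained):
    """Total tapes: the 18 rotating tapes plus one yearly tape per year kept."""
    return sum(ROTATING_TAPES.values()) + years_retained


print(tapes_required(7))   # the 7 yearly tapes used in this article
```

With the 7 yearly tapes from the table, that comes to 25 tapes in total.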

How GFS works is quite simple:

  • Label the tapes Daily 1-8, Weekly 1-4, Monthly 1-6, Yearly 1-7 (or just write the year on these if you intend to keep them indefinitely).
  • Then, Monday to Thursday, use the next of the 8 ‘daily’ tapes in rotation, so the full daily set covers two weeks.
  • Each Friday, use the next Weekly tape instead of a daily tape.
  • On the last Friday of each month, use the next Monthly tape instead (6 monthly tapes are used here, but 12 could be used if appropriate).
  • And finally, on the last business day of each year, back up to the yearly tape.
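The rules above can be sketched as a small Python function that tells you which tape to use on a given date.  This is one possible reading of the rules rather than a definitive implementation.  The function names are made up, the yearly tape is assumed to be labelled with the year, the weekly tape is assumed to follow the Friday’s position in the month, and `ROTATION_START` is an assumed starting Monday, chosen so that the daily rotation lines up with the Jan/Feb 2015 schedule in this article.

```python
from datetime import date, timedelta

# Assumption: a Monday on which the daily rotation starts at "Daily 1".
ROTATION_START = date(2014, 12, 29)


def last_business_day_of_year(year):
    """Step back from 31 December until we land on a weekday."""
    day = date(year, 12, 31)
    while day.weekday() >= 5:          # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def is_last_friday_of_month(day):
    """True if `day` is a Friday and the next Friday falls in another month."""
    return day.weekday() == 4 and (day + timedelta(days=7)).month != day.month


def tape_for(day):
    """Return the tape label for `day`, or None on weekends (no backup)."""
    weekday = day.weekday()            # Monday = 0 ... Sunday = 6
    if weekday >= 5:
        return None
    if day == last_business_day_of_year(day.year):
        return f"Yearly {day.year}"
    if weekday == 4:
        if is_last_friday_of_month(day):
            # Six monthly tapes, reused every six months.
            return f"Monthly {(day.month - 1) % 6 + 1}"
        # 1st, 2nd, 3rd or 4th Friday of the month.
        return f"Weekly {(day.day - 1) // 7 + 1}"
    # Monday to Thursday: tapes 1-4 one week, 5-8 the next.
    weeks_elapsed = (day - ROTATION_START).days // 7
    offset = 4 if weeks_elapsed % 2 else 0
    return f"Daily {offset + weekday + 1}"


print(tape_for(date(2015, 1, 1)))
print(tape_for(date(2015, 1, 30)))
```

Looping `tape_for` over every date in a month is an easy way to produce a printable schedule.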

By following this method, you can go ‘back in time’ with your backups using a minimal number of devices (tapes, USB sticks, DVDs etc): up to two weeks on a daily basis, 4 weeks on a weekly basis, 6 months on a monthly basis and of course years on a yearly basis.

To make this all work (assuming you’re doing it manually), you’ll need to publish your backup schedule and ensure the right tapes or devices are attached to your backup software each day.
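When it comes time to restore, going ‘back in time’ simply means picking the newest backup taken on or before the date you want.  A minimal sketch in Python, where `restore_point` and the sample dates are purely illustrative:

```python
from datetime import date


def restore_point(backup_dates, wanted):
    """Return the newest backup taken on or before `wanted`, or None."""
    candidates = [d for d in backup_dates if d <= wanted]
    return max(candidates) if candidates else None


# Hypothetical set of backups still held on tape.
backups = [date(2015, 1, 2), date(2015, 1, 30), date(2015, 2, 5)]

print(restore_point(backups, date(2015, 2, 1)))
```

The more tapes you retain, the closer the returned date will be to the one you asked for.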

This can best be explained by example.  Please see the schedule below, created for Jan/Feb 2015, where the months run down the page and the day of the month runs across the page.  Weekends are marked because backups are not run on those days.  You can of course modify this to suit.


Key: D = Daily, W = Weekly, M = Monthly, Sa = Saturday, Su = Sunday

2015       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
January   D4 W1 Sa Su D5 D6 D7 D8 W2 Sa Su D1 D2 D3 D4 W3 Sa Su D5 D6 D7 D8 W4 Sa Su D1 D2 D3 D4 M1 Sa
February  Su D5 D6 D7 D8 W1 Sa Su D1 D2 D3 D4 W2 Sa Su D5 D6 D7 D8 W3 Sa Su D1 D2 D3 D4 M2 Sa

Backup Software

The problem with all these scenarios is that it becomes difficult to keep offsite backups current.  If you’re a big corporate standing to lose a lot of money then this isn’t a problem, as you’re likely to pay a specialist company to come in each day and collect your backup media.  If you’re not a large company, however, this is more of a challenge.  Not only do you have to decide how often it is practical to take a new copy offsite, but you have to actually run the backup and consider how much media to buy.  Luckily, these days we have more options available than ever before.

Local Backup

One of the best backup systems around, and one that runs a system very similar to GFS, is Apple’s “Time Machine”.  This, of course, is only available if you own a Mac.  It is also possible to combine Time Machine with an offsite backup scenario.

If you run Windows, there is no built-in backup solution that automates any sort of GFS system.  The closest thing is shadow copy, but it’s really not that close.  A better solution would be to get third party software such as the free Genie Timeline.

If you’re lucky enough to run a NAS device, be sure to check out the snapshot feature.  If your NAS does not have a snapshot feature (QNAP doesn’t), then be sure to check out rsnapshot, of which a guide will be coming to Tech-KnowHow.com soon.

Internet Backup

These days, there is of course the internet.  This removes the challenge of organising and moving files offsite, and introduces advantages such as being able to back up more frequently than daily, including on weekends.  Generally there is a fee for this type of backup and you would need to weigh up the cost vs the convenience, features and risk.  We’ll be doing a review on some of these internet or ‘cloud’ based backup services in the near future, so be sure to check back.

So there you have it, a host of information you never knew you needed about backups.

As always, I welcome your insights and opinions in the comments section below.

Marshalleq