Developing a Backup Strategy Part 2 – Backup Types, Schedules, and Protection

Will Willis

 

Last time around we began our discussion of backup strategies, which are part of an overall disaster recovery plan. We examined a few of the common types of backup solutions, and then focused on performing tape backups. We also discussed the different types of data that you have to consider when deciding what to back up.

 

This time we’re going to continue our discussion of backups by looking at the different types of backups you can perform, and then discuss creating a backup schedule. We’ll conclude with some thoughts on offsite storage plans.

Full (Normal) Backup

The Full Backup is exactly what its name implies: a backup in which every file on your hard drive is saved, including system files, applications, data, and the Registry if you are running Windows. Should you ever have a server crash, this is the easiest type of backup to restore from because it requires only a single backup set (which is usually a single tape).

 

Getting a good full backup is sometimes difficult because some files will always be in use, and many backup programs require exclusive access to a file in order to back it up successfully. Files that are open or in use are generally skipped and noted in the backup log file. Many commercial backup programs have “agents,” which are plug-in utilities that allow the backup software to back up open files. Two of the more popular backup programs, ARCserve and Backup Exec, have optional modules to back up open files as well as open databases from third-party server products such as Microsoft Exchange Server and Lotus Notes. The backup program bundled with the Windows NT/2000 operating system will not back up open files or open databases, though Exchange Server does include a replacement backup utility that will back up the open Exchange Server databases. The basic NT/2000 Backup utility will back up the system Registry, though.

Don’t Underestimate Tape Capacity Requirements

As a rule of thumb, disk space usage on a server doubles roughly every two years. Ideally a full backup should fit on a single tape, so keep growth in mind as you plan your backup strategy. Also be aware of the usable life of a tape. As a tape gets older it becomes more susceptible to stretching and wear, and, as with a cassette tape, the metal oxide coating that holds the data begins to flake off, creating dead spots on the tape. This results in more tape errors and an increased risk of being unable to restore data when you need it. How often you have to replace your tapes depends on how many times you use them for backups; a moderately used tape in a two-week backup rotation should be replaced approximately once a year.
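
To make the growth rule of thumb concrete, here is a minimal Python sketch that projects disk usage and estimates how long a full backup will keep fitting on one tape. The 40 GB starting size and 100 GB tape capacity are made-up numbers for illustration, and the steady two-year doubling is only the rough assumption stated above, so treat the result as a planning estimate.

```python
import math

def projected_size_gb(current_gb, years, doubling_period_years=2.0):
    """Project disk usage, assuming it doubles every doubling_period_years."""
    return current_gb * 2 ** (years / doubling_period_years)

def years_until_tape_full(current_gb, tape_capacity_gb, doubling_period_years=2.0):
    """Years until a full backup no longer fits on one tape (0 if it already doesn't)."""
    if current_gb >= tape_capacity_gb:
        return 0.0
    return doubling_period_years * math.log2(tape_capacity_gb / current_gb)

# Hypothetical figures: 40 GB in use today, tapes that hold 100 GB.
print(projected_size_gb(40, 4))                   # 160.0
print(round(years_until_tape_full(40, 100), 2))   # 2.64
```

With these assumed figures, a tape that looks generously sized today would be outgrown in under three years.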

Incremental Backups

 

Whereas a full backup backs up the entire hard drive, an incremental backup backs up only those files that have changed since the last backup. So how does the backup software determine which files have changed? This is done through the archive bit.

 

The archive bit is a property of a file that is manipulated during the backup process. When a file is backed up, the backup software clears the archive bit for that file. A cleared bit tells the software that the file has not changed since it was last backed up. When a file is later modified by a user or by the system, the archive bit is set again. That way, the next time a backup is run, the backup software can tell which files have changed since the last backup.

 

As we briefly mentioned, an incremental backup backs up only those files that have changed since the last backup of any type. It does this by clearing the archive bit for any file it backs up. That way only the files that have been modified (had their archive bit set) since the last backup will be backed up. If you have a server crash and need to perform a full restore, you will need the last full backup tape plus all of the incremental backups made since that time. For example, suppose you have a backup schedule where you perform a full backup on Friday and incremental backups Monday through Thursday, and you have a server crash on Wednesday morning. To restore that drive you will need the previous Friday’s full backup plus Monday’s and Tuesday’s incremental backups. Incremental backups are ideal for daily backups because they typically back up far less information than a full backup and are therefore much quicker to perform. The disadvantage is that there will be more tape shuffling than with any other backup method, as your data is spread across more tapes.
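
The Friday-to-Wednesday example can be sketched as a small Python model. This is only an illustration of the archive-bit logic, not how any particular backup product is implemented; the file names and the dictionary that stands in for the file system are invented for the example.

```python
def full_backup(files):
    """Back up every file and clear every archive bit."""
    tape = sorted(files)
    for name in files:
        files[name] = False
    return tape

def incremental_backup(files):
    """Back up only files whose archive bit is set, then clear the bit."""
    tape = sorted(name for name, bit in files.items() if bit)
    for name in tape:
        files[name] = False
    return tape

def modify(files, name):
    """Changing (or creating) a file sets its archive bit."""
    files[name] = True

# Maps a hypothetical file name to its archive bit (True = set).
files = {"a.doc": True, "b.xls": True, "c.txt": True}
tapes = {"Fri": full_backup(files)}

modify(files, "a.doc")
tapes["Mon"] = incremental_backup(files)   # ['a.doc']

modify(files, "b.xls")
tapes["Tue"] = incremental_backup(files)   # ['b.xls']

# Crash on Wednesday morning: restore the full, then each incremental in order.
restored_from = {}
for day in ["Fri", "Mon", "Tue"]:
    for name in tapes[day]:
        restored_from[name] = day          # later tapes overwrite earlier copies

print(restored_from)  # {'a.doc': 'Mon', 'b.xls': 'Tue', 'c.txt': 'Fri'}
```

Note that leaving out any one incremental tape would silently restore a stale copy of whatever file it held, which is why every tape in the chain matters.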

Differential Backup

A differential backup is very similar to an incremental backup in that it backs up only data that has changed since a previous backup. The difference is that a differential backup backs up those files that have changed since the last full backup. Whereas an incremental backup clears the archive bit on the files it backs up, a differential backup leaves the archive bit set. This way, if you need to restore a server, the only tapes you will need are your last full backup and your last differential backup. The advantage of a differential backup is that, like an incremental backup, only files that have been modified are backed up. The disadvantage compared to incremental backups, however, is that with each day that passes since the last full backup, the backup takes longer to run. Because the differential backup does not clear the archive bit, it backs up all of the files it backed up previously plus any that have been modified since. The differential backup is more advantageous than an incremental backup when it comes time to restore, though, as you usually have less tape shuffling to do.

 

To use our backup schedule example above, a Wednesday differential backup will back up all of the files that were backed up on Monday and Tuesday as well as the files that have been modified on Wednesday. This is because it backs up all files that have changed since the last time a full backup was run.
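
The way a differential backup grows through the week can be shown with a short Python sketch. As before, the dictionary of archive bits and the file names are invented for illustration; the only behavior modeled is that a differential backup reads the archive bit but never clears it.

```python
def full_backup(files):
    """Back up every file and clear every archive bit."""
    tape = sorted(files)
    for name in files:
        files[name] = False
    return tape

def differential_backup(files):
    """Back up files whose archive bit is set, leaving the bit untouched."""
    return sorted(name for name, bit in files.items() if bit)

# Maps a hypothetical file name to its archive bit (True = set).
files = {"a.doc": True, "b.xls": True, "c.txt": True}
friday = full_backup(files)

files["a.doc"] = True                    # modified on Monday
monday = differential_backup(files)

files["b.xls"] = True                    # modified on Tuesday
tuesday = differential_backup(files)

files["c.txt"] = True                    # modified on Wednesday
wednesday = differential_backup(files)

print(monday)      # ['a.doc']
print(tuesday)     # ['a.doc', 'b.xls']
print(wednesday)   # ['a.doc', 'b.xls', 'c.txt']
```

Each day's tape is a superset of the previous day's, so a restore needs only the Friday full and the most recent differential.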

Copy Backup

 

A copy backup is a type of backup that doesn’t manipulate archive bits at all. It can be used to back up the entire hard drive like a full backup, or selected files and directories. So what’s the point? A copy backup is useful for doing a one-off full backup of a server without throwing off your backup schedule. For example, say you are making a major configuration change on Tuesday night, installing a new piece of hardware and a new application package on the server. You want to do a full backup before you do this as a “just in case” measure. If you do a standard full backup it will clear all of your archive bits, which will interfere with the incremental or differential backups that are based on the previous Friday’s full backup. So instead you do a copy backup, which gets you what you want, a good full backup of the server, without interfering with your regular backup schedule.
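
The Tuesday-night scenario can be modeled in a few lines of Python. The sketch below uses an invented dictionary of archive bits to compare what Tuesday's regular differential backup would capture after a copy backup versus after an unscheduled full backup.

```python
def copy_backup(files):
    """Back up every file without reading or changing any archive bit."""
    return sorted(files)

def full_backup(files):
    """Back up every file and clear every archive bit."""
    tape = sorted(files)
    for name in files:
        files[name] = False
    return tape

def differential_backup(files):
    """Back up files whose archive bit is set, leaving the bit untouched."""
    return sorted(name for name, bit in files.items() if bit)

# Hypothetical state on Tuesday: Friday's full cleared every bit, then
# a.doc was modified on Monday, which set its bit again.
tuesday_state = {"a.doc": True, "b.xls": False, "c.txt": False}

with_copy = dict(tuesday_state)
copy_backup(with_copy)
after_copy = differential_backup(with_copy)

with_full = dict(tuesday_state)
full_backup(with_full)
after_full = differential_backup(with_full)

print(after_copy)   # ['a.doc']  the schedule is undisturbed
print(after_full)   # []         Monday's change is missing from the differential
```

After the unscheduled full backup, the differential tapes for the rest of the week would have to be restored on top of Tuesday's tape rather than Friday's, which is exactly the confusion a copy backup avoids.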

Program Specific Backup

 

This is a special kind of backup that isn’t part of a normal backup software scheme. Some programs, most notably databases such as Microsoft SQL Server, have an internal backup program that will back up to the record or field level and dump the contents out to another file on the drive, which is then backed up by your regular backup software. These backups must be configured within the program itself and are specific to the program they come with.
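
As an illustration of the dump-to-file idea, here is a minimal Python sketch that uses SQLite as a stand-in. SQLite is not one of the products mentioned above; it is used here only because Python's standard library ships with it and exposes its built-in backup call (Connection.backup, available in Python 3.7 and later). The file names and the orders table are invented for the example.

```python
import os
import sqlite3
import tempfile

def dump_database(live_path, dump_path):
    """Copy a SQLite database to dump_path using the database's own backup API."""
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(dump_path)
    try:
        src.backup(dst)   # the database engine produces a consistent copy
    finally:
        dst.close()
        src.close()

workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "live.db")        # hypothetical live database
dump = os.path.join(workdir, "live_dump.db")   # file the tape backup would pick up

con = sqlite3.connect(live)
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
con.execute("INSERT INTO orders (item) VALUES ('tape drive')")
con.commit()

dump_database(live, dump)   # con is still open while the dump runs

check = sqlite3.connect(dump)
rows = check.execute("SELECT item FROM orders").fetchall()
check.close()
con.close()
print(rows)   # [('tape drive',)]
```

The regular backup software then only has to back up the dump file, which is closed and not in use when the tape job runs.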

 

How you set up your backup schedule will depend on a number of different elements within your organization. The table below summarizes the backup types with their advantages and disadvantages.

 

Backup Method  What’s Backed Up                             Archive Bit                 Advantage                                                         Disadvantage
Full           every file                                   cleared on all files        restore needs only one backup set                                 time consuming
Incremental    files changed since last backup of any type  cleared on files backed up  fast                                                              restore requires full plus every incremental tape since
Differential   files changed since last full backup         unchanged                   faster than full; restore needs only full plus last differential  backup takes longer each day since last full
Copy           every file                                   unchanged                   one-off backup doesn’t affect backup schedule                     time consuming

Planning and Scheduling the Backup of Your Enterprise

 

As you can see, there are a lot of considerations to take into account when it comes to planning a backup strategy. Once you have determined what types of backups you are going to do, you will need to develop a schedule for your backups. The point of having backups is to be able to restore data in the least amount of time with the least amount of hassle. Having a schedule gives you a documented plan: you know exactly when data was backed up and how (what method). Not all restores will be full server recoveries; often users will request the restore of certain files that have been accidentally deleted or otherwise damaged (such as when a user saves unwanted modifications and cannot revert to the original version). Your schedule allows you to quickly determine which tape to restore from without having to spend a lot of time searching individual tapes for the file you are looking for.

 

A backup schedule should be set up in a way that provides the maximum amount of data protection with the least administrative effort. The best known of these schedules is the Grandfather-Father-Son (GFS) method.

Grandfather-Father-Son (GFS)

 

The GFS backup strategy is a method of maintaining backups on a daily, weekly, and monthly schedule. GFS backup schemes are based on a five-day or seven-day weekly schedule (depending on your organization), beginning on any day (though typically Friday or Monday). A full backup is performed at least once a week. On all other days, full, partial (incremental or differential), or no backups are performed. The daily incremental or differential backups are known as the Son. The Father is the last full backup in the week (the weekly backup). The Grandfather is the last full backup of the month (the monthly backup). The tables below show an example of a weekly backup schedule using full and differential backups.

 

5-day Backup Schedule

Sun       Mon       Tue       Wed       Thur      Fri       Sat
None      Diff      Diff      Diff      Diff      Full      None

 

7-day Backup Schedule

Sun       Mon       Tue       Wed       Thur      Fri       Sat
Diff      Diff      Diff      Diff      Diff      Diff      Full
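
To show how a documented schedule answers the question of which tapes to pull, here is a hedged Python sketch for the five-day schedule above (full on Friday, differential Monday through Thursday). The function name and the sample dates are invented for illustration; October 2000 is used only as a convenient real calendar month.

```python
import datetime

FRIDAY = 4  # datetime.date.weekday(): Monday is 0, Sunday is 6

def tapes_needed(crash_day):
    """Return the backup dates needed for a full restore on the morning of crash_day.

    Assumes the five-day schedule: full on Friday, differential Monday to
    Thursday, and no backups at the weekend.
    """
    one_day = datetime.timedelta(days=1)

    # Most recent day before the crash on which any backup ran.
    last_backup = crash_day - one_day
    while last_backup.weekday() > FRIDAY:
        last_backup -= one_day

    # Most recent Friday full on or before that day.
    last_full = last_backup
    while last_full.weekday() != FRIDAY:
        last_full -= one_day

    if last_full == last_backup:
        return [last_full]             # the full backup alone is current
    return [last_full, last_backup]    # full plus the latest differential

# Crash on Wednesday 11 October 2000: Friday's full plus Tuesday's differential.
print(tapes_needed(datetime.date(2000, 10, 11)))
# [datetime.date(2000, 10, 6), datetime.date(2000, 10, 10)]
```

With incremental backups in place of differentials, the function would instead have to return every daily tape between the full backup and the crash.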

 

In general with a GFS rotation, you can reuse daily tapes after four days (five-day schedule) or six days (seven-day schedule). A weekly tape can be overwritten once five weeks have passed since it was last written to. Monthly media are saved throughout the year, and should be taken offsite for storage. The primary purpose of the GFS scheme is to suggest a minimum standard and a consistent interval at which to rotate and retire the tapes. The following table shows an example GFS implementation over the course of two months, starting with a 31-day month that begins on a Sunday and with the full backup run every Friday. While I use differential backups for the daily backups, the schedule would be the same using incremental backups instead.

Two month GFS rotation scheme on a five-day schedule

Month 1 (Full-W=Weekly backup, Full-M=Monthly backup)

Sun       Mon       Tue       Wed       Thur      Fri       Sat
1         2         3         4         5         6         7
None      Diff      Diff      Diff      Diff      Full-W    None
          Tape1     Tape2     Tape3     Tape4     Tape5

Sun       Mon       Tue       Wed       Thur      Fri       Sat
8         9         10        11        12        13        14
None      Diff      Diff      Diff      Diff      Full-W    None
          Tape1     Tape2     Tape3     Tape4     Tape6

Sun       Mon       Tue       Wed       Thur      Fri       Sat
15        16        17        18        19        20        21
None      Diff      Diff      Diff      Diff      Full-W    None
          Tape1     Tape2     Tape3     Tape4     Tape7

Sun       Mon       Tue       Wed       Thur      Fri       Sat
22        23        24        25        26        27        28
None      Diff      Diff      Diff      Diff      Full-M    None
          Tape1     Tape2     Tape3     Tape4     Tape8*

Sun       Mon       Tue
29        30        31
None      Diff      Diff
          Tape1     Tape2

Month 2

                              Wed       Thur      Fri       Sat
                              1         2         3         4
                              Diff      Diff      Full-W    None
                              Tape3     Tape4     Tape9

Sun       Mon       Tue       Wed       Thur      Fri       Sat
5         6         7         8         9         10        11
None      Diff      Diff      Diff      Diff      Full-W    None
          Tape1     Tape2     Tape3     Tape4     Tape10

Sun       Mon       Tue       Wed       Thur      Fri       Sat
12        13        14        15        16        17        18
None      Diff      Diff      Diff      Diff      Full-W    None
          Tape1     Tape2     Tape3     Tape4     Tape5

Sun       Mon       Tue       Wed       Thur      Fri       Sat
19        20        21        22        23        24        25
None      Diff      Diff      Diff      Diff      Full-M    None
          Tape1     Tape2     Tape3     Tape4     Tape11*

Sun       Mon       Tue       Wed       Thur      Fri       Sat
26        27        28        29        30        1         2
None      Diff      Diff      Diff      Diff      Full-W    None
          Tape1     Tape2     Tape3     Tape4     Tape6

 

Extending the above rotation over a full year, you can quickly calculate that it will take a total of 21 tapes. There are four daily tapes (Sons) that are reused every week, five weekly tapes (Fathers) that are reused in turn once the fifth weekly full backup is complete, and 12 monthly tapes (Grandfathers), which hold the last full backup of each month and are taken offsite. The need for storing monthly tapes leads us into our next topic.
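
The rotation and the tape count can be reproduced with a short Python sketch. It assumes the five-day schedule with Friday full backups and uses October and November 2000 simply because those months have the same layout as the example (a 31-day month starting on a Sunday). The labels Daily-n, Weekly-n, and Monthly-n are invented here and do not match the Tape1 to Tape11 numbering in the table.

```python
import datetime

DAILY_TAPES = 4      # Sons: one each for Monday to Thursday
WEEKLY_TAPES = 5     # Fathers: reused in turn
MONTHLY_TAPES = 12   # Grandfathers: kept for a year, stored offsite

def rotation(start, end):
    """Yield (date, backup type, tape label) for a five-day GFS schedule."""
    one_week = datetime.timedelta(days=7)
    weekly_used = 0
    monthly_used = 0
    day = start
    while day <= end:
        weekday = day.weekday()          # Monday is 0, Friday is 4
        if weekday < 4:
            yield day, "Diff", f"Daily-{weekday + 1}"
        elif weekday == 4:
            # The last Friday of a month is the one whose next Friday
            # falls in a different month.
            if (day + one_week).month != day.month:
                yield day, "Full-M", f"Monthly-{monthly_used % MONTHLY_TAPES + 1}"
                monthly_used += 1
            else:
                yield day, "Full-W", f"Weekly-{weekly_used % WEEKLY_TAPES + 1}"
                weekly_used += 1
        day += datetime.timedelta(days=1)

plan = list(rotation(datetime.date(2000, 10, 1), datetime.date(2000, 12, 1)))
fulls = [label for _, kind, label in plan if kind != "Diff"]
print(fulls)
# ['Weekly-1', 'Weekly-2', 'Weekly-3', 'Monthly-1', 'Weekly-4',
#  'Weekly-5', 'Weekly-1', 'Monthly-2', 'Weekly-2']

print(DAILY_TAPES + WEEKLY_TAPES + MONTHLY_TAPES)   # 21
```

The first weekly tape comes back into use on the sixth weekly full backup, matching the point in the table where Tape5 reappears.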

 

Protecting Your Backup Tapes

 

You’ve been diligent in your backup practices, performing a perfect GFS tape rotation and keeping your tapes readily available on a shelf in your server room. What’s wrong with this picture? While your data is protected against a server failure, in this scenario it is not protected against a disaster that affects the whole room.

What if a fire broke out in your server room from a wiring fault? Your precious backup tapes just got melted down along with the server, and all data is lost. A wise investment for the server room is a heavy-duty fireproof safe for storing tapes. That way, if you have a fire, your tapes should be able to survive.

But what if you have flooding? Even if you aren’t located in a flood plain, stranger things have happened than a pipe bursting in a restroom on the floor above you and water raining down through the ceiling tiles. A fireproof safe isn’t necessarily waterproof, and while your server shorts out and crashes, your tapes are going under water. What if your building is hit by a tornado?

The only way to really ensure that your data is protected is to store copies of it offsite. In the GFS tape rotation the monthly backups, or Grandfathers, are taken offsite for storage. If a natural disaster strikes the office building and server room and wipes out your data, you have an ace in the hole, so to speak, and can probably recover enough of the company data to at least carry on with business. In addition to protecting against natural disaster, storing tapes offsite also protects against the possibility of losing all corporate data to theft or vandalism.

 

Summary

In the last two articles we have looked at backup strategies as part of an overall disaster recovery plan. While not the most glamorous part of a system administrator’s job, protection of data is arguably the most important function of an administrator. By determining your backup needs and then devising a schedule to meet those needs, you’ll ensure that if disaster does ever strike you will be able to recover from it.

 

Questions or comments? Will can be reached at WWillis@Transcender.com