Last time we began our discussion of backup strategies, which are part of an overall disaster recovery plan. We examined a few of the common types of backup solutions and then focused on performing tape backups. We also discussed the different types of data that you have to consider when deciding what to back up.
This time we’re going to continue our discussion of backups by looking at the different types of backups you can perform, and then discuss creating a backup schedule. We’ll conclude with some thoughts on offsite storage plans.
A full backup is exactly what the name implies: a backup in which every file on your hard drive is saved, including system files, applications, data, and the Registry if you are running Windows. Should a server ever crash, this is the easiest type of backup to restore from because it requires only a single backup set (which is usually a single tape).
Getting a good full backup is sometimes difficult because some files will always be in use, and many backup programs require exclusive access to a file in order to back it up successfully. Files that are open or in use are generally skipped and noted in the backup log file. Many commercial backup programs offer “agents,” plug-in utilities that allow the backup software to back up open files successfully. Two of the more popular backup programs, ARCserve and Backup Exec, have optional modules to back up open files as well as open databases from third-party server products such as Microsoft Exchange Server and Lotus Notes. The backup program bundled with the Windows NT/2000 operating system will not back up open files or open databases, though Exchange Server does include a replacement backup utility that will back up the open Exchange Server databases. The basic NT/2000 Backup utility will, however, back up the system Registry.
As a rule of thumb, disk space usage on a server doubles roughly every two years. Ideally a full backup should fit on a single tape, so keep growth in mind as you plan your backup strategy. Also be aware of the usable life of a tape. As a baseline, plan to replace your tapes annually to ensure the best quality backups. As a tape gets older it becomes more susceptible to stretching and wearing out, and, as with a cassette tape, the metal oxide coating that holds the data begins to flake off, creating dead spots on the tape. This results in more tape errors and an increased risk of being unable to restore data when it is needed. How often you have to replace your tapes depends on how many times you use them for backups; a moderately used tape in a two-week backup rotation should be replaced approximately once a year.
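To put the growth advice above into numbers, here is a minimal Python sketch. It assumes the doubling-every-two-years rule of thumb holds exactly. The 20 GB starting size and 40 GB tape capacity are hypothetical figures chosen only for illustration.

```python
import math

def projected_usage_gb(current_gb, years, doubling_period_years=2.0):
    """Project disk usage, assuming it doubles every `doubling_period_years`."""
    return current_gb * 2 ** (years / doubling_period_years)

def years_until_tape_full(current_gb, tape_capacity_gb, doubling_period_years=2.0):
    """Years until a full backup no longer fits on one tape (0 if it already does not)."""
    if current_gb >= tape_capacity_gb:
        return 0.0
    return doubling_period_years * math.log2(tape_capacity_gb / current_gb)

# Hypothetical server: 20 GB of data today, 40 GB tapes.
print(projected_usage_gb(20, 4))       # 80.0 GB after four years
print(years_until_tape_full(20, 40))   # 2.0 years until one tape is outgrown
```

Real growth is rarely this regular, so treat the result as a prompt to review tape capacity, not as a forecast.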
Whereas a full backup backs up the entire hard drive, an incremental backup backs up only those files that have changed since the last backup. So how does the backup software determine which files have changed? It uses the archive bit.
The archive bit is a property of a file that is manipulated during the backup process. When a file is backed up, the backup software clears the archive bit for that file. A cleared bit tells the software that the file has not changed since it was last backed up. When the file is later modified by a user or by the system, the archive bit is set again. That way, the next time a backup is run, the backup software can tell which files have changed since the last backup.
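The life cycle of the archive bit can be modeled in a few lines of Python. This is a simulation only, not how any particular backup product is implemented. The `FileEntry` class and the function names are hypothetical, introduced here for illustration.

```python
from dataclasses import dataclass

@dataclass
class FileEntry:
    name: str
    archive: bool = True   # a newly created file starts with the bit set

def modify(entry):
    """Any write to the file sets the archive bit again."""
    entry.archive = True

def back_up(entry):
    """A normal backup copies the file, then clears its archive bit."""
    entry.archive = False

def changed_since_last_backup(files):
    """Names of the files whose archive bit is currently set."""
    return [f.name for f in files if f.archive]

files = [FileEntry("report.doc"), FileEntry("budget.xls")]
for f in files:
    back_up(f)                          # both bits are now cleared
modify(files[1])                        # a user edits budget.xls
print(changed_since_last_backup(files)) # ['budget.xls']
```

On Windows, Python exposes the real bit through `os.stat(path).st_file_attributes` and the constant `stat.FILE_ATTRIBUTE_ARCHIVE`; the model above avoids touching the file system so that it runs anywhere.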
As we briefly mentioned, an incremental backup backs up only those files that have changed since the last backup of any type. It does this by clearing the archive bit for every file it backs up. That way, only the files that have been modified (had their archive bit set) since the last backup will be backed up. If you have a server crash and need to perform a full restore, you will need the last full backup tape plus all of the incremental backups made since that time. For example, suppose your backup schedule calls for a full backup on Friday and incremental backups Monday through Thursday, and the server crashes on Wednesday morning. To restore that drive you will need the previous Friday’s full backup plus Monday’s and Tuesday’s incremental backups. Incremental backups are ideal for daily backups because they typically back up far less information than a full backup and are therefore much quicker to perform. The disadvantage is that there will be more tape shuffling than with any other backup method, as your data is spread across more tapes.
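The restore rule in the example above (last full backup plus every incremental since) can be sketched as a small Python function. The function name and the shape of the `history` list are assumptions made for this illustration.

```python
def incremental_restore_set(history):
    """Return the backups needed for a full restore, in restore order.

    `history` is a chronological list of (label, kind) tuples, where kind is
    "full" or "incremental". The result is the most recent full backup plus
    every incremental taken after it.
    """
    last_full = None
    for i, (_, kind) in enumerate(history):
        if kind == "full":
            last_full = i
    if last_full is None:
        raise ValueError("no full backup in history")
    return [label for label, _ in history[last_full:]]

# The server crashes on Wednesday morning, before Wednesday's backup runs.
history = [
    ("Friday full", "full"),
    ("Monday incremental", "incremental"),
    ("Tuesday incremental", "incremental"),
]
print(incremental_restore_set(history))
# ['Friday full', 'Monday incremental', 'Tuesday incremental']
```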
A differential backup is very similar to an incremental backup in that it backs up only data that has changed since a previous backup. The difference is that a differential backup backs up all files that have changed since the last full backup. Whereas an incremental backup clears the archive bit on the files it backs up, a differential backup leaves the archive bit set. This way, if you need to restore a server, the only tapes you will need are your last full backup and your last differential backup. The advantage of a differential backup is that, like an incremental backup, it backs up only files that have been modified. The disadvantage compared to incremental backups is that the backup takes longer with each day that passes since the last full backup. Because the differential backup does not clear the archive bit, it backs up all of the files it backed up previously plus any that have been modified since. The differential backup has the advantage over an incremental backup when it comes time to restore, though, as you usually have less tape shuffling to do.
To use our backup schedule example above, a Wednesday differential backup will back up all of the files that were backed up on Monday and Tuesday as well as the files that have been modified on Wednesday. This is because it backs up all files that have changed since the last time a full backup was run.
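The Wednesday example can be checked with a short Python simulation that tracks which files have their archive bit set. The file names and the `run_week` helper are hypothetical; the model assumes a full backup has just cleared every archive bit before Monday.

```python
def run_week(daily_changes, kind):
    """Return what each daily backup copies after a full backup.

    `daily_changes` maps a day name to the set of files modified that day.
    `kind` is "incremental" (clears the archive bit) or "differential"
    (leaves it set).
    """
    flagged = set()                      # files whose archive bit is set
    copied = {}
    for day, changed in daily_changes.items():
        flagged |= changed               # modifications set the archive bit
        copied[day] = sorted(flagged)    # the backup copies every flagged file
        if kind == "incremental":
            flagged.clear()              # only incremental clears the bit
    return copied

changes = {"Mon": {"a.doc"}, "Tue": {"b.xls"}, "Wed": {"c.txt"}}
print(run_week(changes, "differential")["Wed"])  # ['a.doc', 'b.xls', 'c.txt']
print(run_week(changes, "incremental")["Wed"])   # ['c.txt']
```

The growing differential list is the cost you pay for needing only two tapes at restore time.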
A copy backup is a type of backup that doesn’t manipulate archive bits at all. It can be used to back up the entire hard drive, like a full backup, or only selected files and directories. So what’s the point? A copy backup is useful for doing a one-off full backup of a server without throwing off your backup schedule. For example, say you are making a major configuration change on Tuesday night, installing a new piece of hardware and a new application package on the server. You want to do a full backup before you do this as a “just in case” measure. If you do a standard full backup it will clear all of your archive bits, which will interfere with the incremental or differential backups that are normally based on the previous Friday’s full backup. So instead you do a copy backup, which gets you what you want, a good full backup of the server, without interfering with your regular backup schedule.
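The interference described above can be demonstrated with a small Python model of the four file-level backup types. It is a sketch under the archive-bit rules given in this article; the `backup` function and file names are hypothetical.

```python
def backup(files, kind):
    """Simulate one backup run over `files`, a dict of name -> archive bit.

    Returns the names copied. "full" and "copy" copy everything;
    "incremental" and "differential" copy only files with the bit set.
    "full" and "incremental" clear the bit; "copy" and "differential" do not.
    """
    if kind in ("full", "copy"):
        selected = sorted(files)
    elif kind in ("incremental", "differential"):
        selected = sorted(name for name, bit in files.items() if bit)
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    if kind in ("full", "incremental"):
        for name in selected:
            files[name] = False
    return selected

# State after Friday's full backup, with one file modified on Monday.
with_copy = {"system.ini": False, "report.doc": True}
backup(with_copy, "copy")                  # Tuesday night one-off copy backup
print(backup(with_copy, "differential"))   # ['report.doc']

with_full = {"system.ini": False, "report.doc": True}
backup(with_full, "full")                  # Tuesday night full backup instead
print(backup(with_full, "differential"))   # []
```

After the unscheduled full backup, Wednesday's differential no longer contains Monday's change, so a restore based on Friday's tape would miss it.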
Finally, there is a special kind of backup that isn’t part of a normal backup software scheme: the backup built into an application. Some programs, most notably databases such as Microsoft SQL Server, have an internal backup program that backs up at the record or field level and dumps the contents to another file on the drive, which is then backed up by your regular backup software. These backups must be configured within the program itself and are specific to the program they come with.
How you set up your backup schedule will depend on a number of different factors within your organization. The table below summarizes the backup types with their advantages and disadvantages.
| Backup Method | What’s Backed Up | Archive Bit | Advantage | Disadvantage |
|---|---|---|---|---|
| Full | Every file | Cleared on all files | Restore needs only one tape | Time consuming |
| Incremental | Files changed since last backup of any type | Cleared on files backed up | Fast | Restore requires full plus all previous incremental tapes |
| Differential | Files changed since last full backup | Unchanged | Faster than full; restore requires only full plus last differential tape | Backup time grows as days pass since last full backup |
| Copy | Every file | Unchanged | One-off backup doesn’t affect backup schedule | Time consuming |
As you can see, there are a lot of considerations to take into account when planning a backup strategy. Once you have determined what types of backups you are going to do, you will need to develop a schedule for them. The point of having backups is to be able to restore data in the least amount of time with the least amount of hassle. Having a schedule gives you a documented plan: you know exactly when data was backed up and how (by what method). Not all restores will be full server recoveries. Often users will ask you to restore individual files that have been accidentally deleted or otherwise damaged (for example, when a user has saved unwanted modifications and cannot revert to the original version). Your schedule allows you to quickly determine which tape to restore from without spending a lot of time searching individual tapes for the file you are looking for.
A backup schedule should be set up in a way that provides the maximum amount of data protection with the least administrative effort. One of the best such schedules is what is known as the Grandfather-Father-Son (GFS) method.
The GFS backup strategy is a method of maintaining backups on a daily, weekly, and monthly schedule. GFS backup schemes are based on a five-day or seven-day weekly schedule (depending on your organization), beginning on any day (though typically Friday or Monday). A full backup is performed at least once a week. On all other days, full, partial (incremental or differential), or no backups are performed. The daily incremental or differential backups are known as the Son. The Father is the last full backup in the week (the weekly backup). The Grandfather is the last full backup of the month (the monthly backup). The tables below show an example of a weekly backup schedule using full and differential backups.
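5-day Backup Schedule

| Sun | Mon | Tue | Wed | Thur | Fri | Sat |
|---|---|---|---|---|---|---|
| None | Diff | Diff | Diff | Diff | Full | None |

7-day Backup Schedule

| Sun | Mon | Tue | Wed | Thur | Fri | Sat |
|---|---|---|---|---|---|---|
| Diff | Diff | Diff | Diff | Diff | Diff | Full |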
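The Son, Father, and Grandfather roles on the five-day schedule can be expressed as a short Python function. This is a sketch that assumes the full backup runs every Friday; the function name and the calendar dates in the usage lines (October and November 2023) are my own choices for illustration.

```python
from datetime import date, timedelta

FULL_DAY = 4                    # Monday is 0, so 4 is Friday
BACKUP_DAYS = {0, 1, 2, 3, 4}   # five-day schedule: Monday to Friday

def gfs_role(day):
    """Classify a date as "Son", "Father", "Grandfather", or None (no backup)."""
    if day.weekday() not in BACKUP_DAYS:
        return None
    if day.weekday() != FULL_DAY:
        return "Son"            # daily incremental or differential
    # The last full backup of the month is the one with no later Friday
    # in the same month.
    next_full = day + timedelta(days=7)
    return "Grandfather" if next_full.month != day.month else "Father"

print(gfs_role(date(2023, 10, 2)))    # Son (a Monday)
print(gfs_role(date(2023, 10, 6)))    # Father (a Friday, not the last)
print(gfs_role(date(2023, 10, 27)))   # Grandfather (last Friday of October)
print(gfs_role(date(2023, 10, 28)))   # None (a Saturday)
```

For a seven-day schedule you would widen `BACKUP_DAYS` and move `FULL_DAY` to whichever day carries the full backup.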
In general with a GFS rotation, you can reuse daily tapes after four days (five-day schedule) or six days (seven-day schedule). A weekly tape can be overwritten once five weeks have passed since it was last written to. Monthly media are saved throughout the year and should be taken offsite for storage. The primary purpose of the GFS scheme is to suggest a minimum standard and a consistent interval at which to rotate and retire the tapes. The following table shows an example GFS implementation over the course of two months, using a month in which the 1st conveniently falls on a Sunday, with the full backup on the Friday of each week. While I use differential backups for the daily backups, the schedule would be the same using incremental backups instead.
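Two-month GFS rotation scheme on a five-day schedule

Month 1 (Full-W = weekly backup, Full-M = monthly backup)

| | Sun | Mon | Tue | Wed | Thur | Fri | Sat |
|---|---|---|---|---|---|---|---|
| Date | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| Backup | None | Diff | Diff | Diff | Diff | Full-W | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape5 | |
| Date | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| Backup | None | Diff | Diff | Diff | Diff | Full-W | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape6 | |
| Date | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
| Backup | None | Diff | Diff | Diff | Diff | Full-W | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape7 | |
| Date | 22 | 23 | 24 | 25 | 26 | 27 | 28 |
| Backup | None | Diff | Diff | Diff | Diff | Full-M | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape8* | |
| Date | 29 | 30 | 31 | | | | |
| Backup | None | Diff | Diff | | | | |
| Tape | | Tape1 | Tape2 | | | | |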
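Month 2

| | Sun | Mon | Tue | Wed | Thur | Fri | Sat |
|---|---|---|---|---|---|---|---|
| Date | | | | 1 | 2 | 3 | 4 |
| Backup | | | | Diff | Diff | Full-W | None |
| Tape | | | | Tape3 | Tape4 | Tape9 | |
| Date | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| Backup | None | Diff | Diff | Diff | Diff | Full-W | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape10 | |
| Date | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
| Backup | None | Diff | Diff | Diff | Diff | Full-W | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape5 | |
| Date | 19 | 20 | 21 | 22 | 23 | 24 | 25 |
| Backup | None | Diff | Diff | Diff | Diff | Full-M | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape11* | |
| Date | 26 | 27 | 28 | 29 | 30 | 1 | 2 |
| Backup | None | Diff | Diff | Diff | Diff | Full-W | None |
| Tape | | Tape1 | Tape2 | Tape3 | Tape4 | Tape6 | |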
From the above rotation you can quickly calculate that a full year of backups will take a total of 21 tapes. There are four daily tapes (Sons) that are recycled (reused) weekly, five weekly tapes (Fathers) that are recycled after the fifth full weekly backup is complete, and 12 monthly tapes (Grandfathers), which hold the last full backup of each month and are taken offsite. The need for storing monthly tapes leads us into our next topic.
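The tape count can be checked by simulating the rotation. The Python sketch below is one reading of the scheme, not a definitive implementation. It assumes Monday to Thursday differentials on Tape1 to Tape4, a Friday full backup, a pool of five weekly tapes that is recycled oldest-first, and a fresh tape for every monthly backup. The calendar dates are an assumption of mine: October and November 2023 happen to have the same layout as the two months in the rotation table above.

```python
from collections import deque
from datetime import date, timedelta

DAILY_TAPES = {0: "Tape1", 1: "Tape2", 2: "Tape3", 3: "Tape4"}  # Mon to Thu
FULL_DAY = 4           # Friday (Monday is 0)
WEEKLY_POOL_SIZE = 5   # weekly tapes are recycled after the fifth

def gfs_rotation(start, end):
    """Yield (day, backup_type, tape_label) for each backup day in [start, end]."""
    next_new = len(DAILY_TAPES) + 1    # first tape number after the daily set
    weekly_pool = deque()
    day = start
    while day <= end:
        wd = day.weekday()
        if wd in DAILY_TAPES:
            yield day, "Diff", DAILY_TAPES[wd]
        elif wd == FULL_DAY:
            if (day + timedelta(days=7)).month != day.month:
                # Monthly (Grandfather): always a fresh tape, kept all year
                yield day, "Full-M", f"Tape{next_new}*"
                next_new += 1
            elif len(weekly_pool) < WEEKLY_POOL_SIZE:
                # Weekly (Father): the pool is still being built up
                tape = f"Tape{next_new}"
                next_new += 1
                weekly_pool.append(tape)
                yield day, "Full-W", tape
            else:
                # Weekly (Father): recycle the oldest weekly tape
                tape = weekly_pool.popleft()
                weekly_pool.append(tape)
                yield day, "Full-W", tape
        day += timedelta(days=1)

def tapes_used(start, end):
    """Set of distinct tape labels written between start and end."""
    return {tape for _, _, tape in gfs_rotation(start, end)}

fridays = [(d.isoformat(), kind, tape)
           for d, kind, tape in gfs_rotation(date(2023, 10, 1), date(2023, 12, 1))
           if kind != "Diff"]
for row in fridays:
    print(row)
print(len(tapes_used(date(2023, 10, 1), date(2024, 9, 30))))  # 21
```

Under these assumptions the Friday tapes follow the same sequence as the table (Tape5, Tape6, Tape7, Tape8*, Tape9, Tape10, Tape5, Tape11*, Tape6), and a full year uses 4 + 5 + 12 = 21 distinct tapes.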
You’ve been diligent in your backup practices, performing a perfect GFS tape rotation and keeping your tapes readily available on a shelf in your server room. What’s wrong with this picture? While your data is protected against a server failure, it is not, in this scenario, protected against a disaster that strikes the server room itself. What if a wiring fault started a fire in your server room? Your precious backup tapes would melt down along with the server, and all data would be lost. A wise investment for the server room is a heavy-duty fireproof safe for storing tapes. That way, if you have a fire, your tapes should survive.
But what if you have flooding? Even if you aren’t located in a flood plain, stranger things have happened than a pipe bursting in a restroom on the floor above you and water raining down through the ceiling tiles. Fireproof safes aren’t waterproof, and while your server shorts out and crashes, your tapes are going under water. What if your building is hit by a tornado?
The only way to really ensure that your data is protected is to store copies of it offsite. In the GFS tape rotation the monthly backups, or Grandfathers, are taken offsite for storage. If a natural disaster strikes the office building and server room and wipes out your data, you have an ace in the hole, so to speak, and can probably recover enough of the company data to at least carry on with business. In addition to protecting against natural disaster, storing tapes offsite also protects against the possibility of losing all corporate data to theft or vandalism.
In the last two articles we have looked at backup strategies as part of an overall disaster recovery plan. While not the most glamorous part of a system administrator’s job, protection of data is arguably the most important function of an administrator. By determining your backup needs and then devising a schedule to meet those needs, you’ll ensure that if disaster does ever strike you will be able to recover from it.
Questions or comments? Will can be reached at WWillis@Transcender.com