When Disaster Strikes – Having a Written Plan

Will Willis

 

The recent events in New York and Washington D.C. have resulted in a lot of fallout in the United States and in the world as a whole. In addition to changing our lives and the ways we think about our personal safety, they have left companies facing the reality that their businesses could be physically affected by natural or man-made disasters. We saw this to a small degree with the earthquakes that hit the state of Washington earlier this year and the tornadoes that hit downtown Ft. Worth, Texas last year, both of which caused a lot of damage. However, while a lot of companies thought about what they would do if disaster struck, the likelihood of such natural disasters in many areas is remote, so the sense of urgency wasn’t necessarily that high. The tragedy of the terrorist attacks, though, has left many companies trying to figure out what they would do in the event of a man-made disaster, and few people feel completely safe from potential terrorist acts. As IT professionals, we are being called upon not only to protect the intellectual property (data) of our companies, but also to figure out how we would get people working again in the aftermath of a disaster that caused catastrophic damage to our building and network infrastructure. This time around we will look at the components of a disaster recovery plan and how to go about implementing them.

 

Cooperation among Business Units

First of all, it is important to note that while “disaster recovery” is often lumped into the role of the IT department, the logistics of truly resuming business after a disaster will require the input of other business units as well. As you devise your recovery plan, you need to take into account that there is more to resuming the business than just getting the network infrastructure and data services back online. You also need a contingency plan that covers where you will operate if you cannot immediately move back into your current location. Securing a new temporary or permanent location will not fall under the responsibility of IT. However, the IT department needs to be involved in the process to make sure the new location will accommodate the needs of the company. Consider forming a small task force made up of individuals from a few different business units that are most representative of the company as a whole. That way you will get multiple perspectives on how best to prioritize the recovery plan.

 

Identify Business Functions

Once you have your task force together and begin meeting, the next step is to identify what “recovery” means in terms of business functions. Through this process, you can determine which functions are most important and what is required to get each business unit back in operation. A sample worksheet might look something like this:

 

Department

Financial Services

  • Required Functions
    • Process payroll
    • Receive faxed time sheets
    • Fax invoices
    • Pay and track bills (accounts payable) through accounting software
    • Process payments (accounts receivable) through accounting software
    • Cut checks
    • Receive telephone calls from clients and customers
    • Etc.

 

Sales

  • Required Functions
    • Place and receive sales calls (telephones)
    • Send and receive email
    • Input, track, and manage orders through order database
    • Etc.

 

You can even break this down further into categories that prioritize the functions based on how critical they are. Once you have established the business functions, your task force can identify the “enablers” of those functions. For the IT department, when you look at the list of business functions such as those above under Financial Services, you can evaluate them in terms of the technology and infrastructure that enables the department to work. Processing payroll means you have to have the server available that stores the payroll database software. If you print checks in-house, you will need to have a printer available that is capable of printing checks (usually that means one that takes special toner, which is a detail that could easily be overlooked). If you outsource the printing of checks, you need to make sure you have a way of getting the payroll data to the vendor. That might mean ensuring you have a PC available with a modem and phone line that can perform this task.

 

Essentially, you are defining the resources required to bring your business back online. These include capital items such as network hardware (servers, routers, hubs, PCs, printers, etc.) as well as the people necessary to carry out the recovery plan and to resume the business.
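
For task forces that prefer to keep the worksheet in a form that can be sorted and queried, the functions, their priorities, and their enablers can be captured in a small data structure. The following is only a minimal sketch in Python: the priority scale (1 = most critical), the specific priorities assigned, and the enabler names are assumptions made for illustration, not part of any standard worksheet.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessFunction:
    department: str
    name: str
    priority: int  # assumed scale: 1 = most critical
    enablers: list = field(default_factory=list)


# Hypothetical entries based on the sample worksheet; priorities are invented.
WORKSHEET = [
    BusinessFunction("Financial Services", "Process payroll", 1,
                     ["payroll database server", "check printer"]),
    BusinessFunction("Financial Services", "Fax invoices", 2,
                     ["fax machine", "phone line"]),
    BusinessFunction("Sales", "Place and receive sales calls", 1,
                     ["telephones", "phone line"]),
    BusinessFunction("Sales", "Send and receive email", 2,
                     ["email server", "PCs"]),
]


def recovery_order(worksheet):
    """Return each enabler once, ranked by the most critical function that needs it."""
    best = {}
    for function in worksheet:
        for enabler in function.enablers:
            current = best.get(enabler, function.priority)
            best[enabler] = min(current, function.priority)
    # Sort by priority first, then by name so the output is stable.
    return sorted(best.items(), key=lambda item: (item[1], item[0]))


for enabler, priority in recovery_order(WORKSHEET):
    print(f"priority {priority}: {enabler}")
```

A shared enabler such as the phone line inherits the priority of the most critical function that depends on it, which is the kind of dependency that is easy to miss on a paper worksheet.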

 

Vulnerability Assessment

Once you have determined the requirements for bringing your business back online, it is important to analyze your current situation to ensure that a recovery will be feasible. From an IT standpoint, that includes evaluating backup strategies first and foremost. Off-site backup storage is a must. In my next article I’ll discuss backup strategies in more detail, but for now I’ll simply say that your strategy has to include regular backups and periodically taking a copy of the data off-site. That will ensure that if your building does get destroyed, all of your backups aren’t destroyed along with it.
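
One simple way to audit this part of the plan is to record, for each system, when it was last backed up and when a copy last went off-site, and then flag anything that has fallen behind. The sketch below assumes a daily backup and a weekly off-site rotation; those thresholds, the system names, and the dates are all hypothetical and should be replaced with whatever your own strategy calls for.

```python
from datetime import date, timedelta

# Assumed policy: back up daily, rotate a copy off-site weekly.
MAX_BACKUP_AGE = timedelta(days=1)
MAX_OFFSITE_AGE = timedelta(days=7)


def backup_gaps(systems, today):
    """Return a list of warnings for systems whose backups are overdue.

    `systems` maps a system name to (last_backup, last_offsite_copy) dates.
    """
    problems = []
    for name, (last_backup, last_offsite) in systems.items():
        if today - last_backup > MAX_BACKUP_AGE:
            problems.append(f"{name}: last backup on {last_backup} is overdue")
        if today - last_offsite > MAX_OFFSITE_AGE:
            problems.append(f"{name}: last off-site copy on {last_offsite} is overdue")
    return problems


# Hypothetical inventory for illustration only.
inventory = {
    "payroll server": (date(2001, 10, 15), date(2001, 10, 1)),
    "order database": (date(2001, 10, 12), date(2001, 10, 12)),
}

for warning in backup_gaps(inventory, today=date(2001, 10, 15)):
    print(warning)
```

A check like this only shows that backups were taken on schedule; it does not prove they can be restored, which is something only a test restore will confirm.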

 

In addition to backups, you also have to evaluate current security practices as they relate to the physical security of your servers and PCs, network access policies and permissions, antivirus strategies, email security, and telecom equipment. A “disaster” could come in the form of a security breach against one or more of your information systems. In any case, your goal is to identify any potential infiltration points and decide how you will react if a security breach does take place and results in data loss.

 

Making Arrangements

Since part of your disaster recovery plan will involve the contingency of a physical catastrophe to your location and equipment, the task force will need to make arrangements with vendors to be able to rent, lease, or buy replacement equipment on very short notice. You may also need to make arrangements with real estate companies to acquire new facilities if you are unable to move back into your existing building right away. The details of where and how you’ll move should be taken care of before the need arises. If you have multiple locations, your contingency plan might be to move operations to one of the other locations. If you only have one location, you will have to be able to get into new office space as quickly as possible following the disaster.

 

When the tornados tore through downtown Ft. Worth last year, a colleague of mine had the arduous task of getting his company’s data services back online while waiting for their building to be repaired so they could re-inhabit it. With little network infrastructure left in place, he had to help devise a network connecting small locations all around the city through modem lines in order to re-establish communications. It wasn’t pretty, but they managed to limp along for a few weeks until they could get back into their regular office.

 

Testing the Plan

Once you have accumulated the data about your business and determined the steps you will take to recover from various levels of disaster, you have to test your plan. The task force should organize a test disaster recovery scenario that will not only put the plan to the test, but will also show weaknesses and areas for improvement. It is doubtful you’ll get the plan 100% right the first time. Through testing you can improve the efficiency of the disaster recovery plan and eliminate any “gotchas” that you didn’t account for the first time through.

 

Revisiting the Plan

It is especially important that the process doesn’t end after the disaster recovery plan is devised by the task force, tested, and accepted by management. In a future article we will discuss change management, which, as the name suggests, is the process of managing change in an organization. As the company grows, equipment changes, software is revised or replaced, and other changes take place. Such changes can leave your disaster recovery plan out of date and cause confusion if you ever have to put the plan into action. The task force should continue to meet periodically to determine whether the plan is still adequate and to make any necessary adjustments. Keeping the plan up to date reduces the risk of it failing when you need it most.

 

There’s not a lot of concrete detail in this article for one key reason: the disaster recovery plan of every organization will be unique to that organization. If you follow the principles laid out here, however, you should be on the right track to developing a disaster recovery plan that will ensure that, no matter what the catastrophe, your company will be able to recover and carry on the business.

 

Questions or Comments? Will can be reached at WWillis@Transcender.com